A Grail Millennium Project
White Paper
This document is maintained by the author, Lewis A. Sellers, somewhere in the mountains of East Tennessee, the United States of America. It is an informal technical document for a work-in-progress project called Grail Millennium, or fully, The Minimal Operating System of Object Class Interfaces Holy Grail for the Millennium. In other words, for a new, easy to use, long-lived operating system.
Copyright Notice
This document and all material therein, unless otherwise stated, is Copyright © 1995,1996, Lewis A. Sellers. All Rights Reserved. Permission to distribute this document, in part or full, via electronic means (emailed, posted or archived) or printed copy is granted provided that no charges are involved, reasonable attempt is made to use the most current version, and all credits and copyright notices are retained.
Distribution Rights
All requests for other distribution rights, including incorporation in commercial products such as books, magazine articles, CD-ROMs, and/or computer programs, should be made to the primary author Lewis Sellers.
Warranty and disclaimer
This document is provided as is without any express or implied warranties. This is a work-in-progress, and as such will contain outdated, as yet uncorrected, or unsubstantiated assumptions. The author, maintainer and/or contributors assume no responsibility for errors or omissions, or for damages resulting from the use of the information contained herein.
WWW Home Sites
You can currently find home sites to this project at If you can not reach them, or they seem to be down, do a key word search on AltaVista, Lycos, or the Web Crawler search engines.
Contact Email
The primary author of Grail Millennium should be reachable at lsellers@usit.net.
OCI/Spec | A S S E M B L E R
DESIGNER
Lewis A. Sellers (aka Minimalist) lsellers@usit.net
CRITIQUED BY
- Anis Ahmad bq689@freenet.carleton.ca (a1)
- Cleo Saulnier p8uu@jupiter.sun.csd.unb.ca (a2)
- Akintunde Omitowoju tunde@housing.east-tenn-st.edu (a4 -> a6)
- Luc Stepniewski stepniew@isty-info.uvsq.fr (a6.04)
- Arto Sarle arto.sarle@dlc.fi (a6.06)
VALID PROCESSORS
- Intel 386, 486, Pentium, Pentium Pro
- PowerPC 601, 604, 620
- Motorola 68020, 68030, 68040, 68060
- ARM610, ARM710, SA-110
DRAFTS
- August 25 1995
- August 29 1995
- Sept 2 1995 (revision 3)
- December 18 1995 -- revision 4
- Jan 9 1996 -- revision 5.1
- Jan 24 1996 -- revision 5.2
- Jan 29, 1996 revision 5.22
- Jan 31, 1996 revision 6.00
- Feb 1, 1996 revision 6.01
- Feb 2, 1996 revision 6.02
- Feb 5, 1996 revision 6.03
- Feb 6, 1996 revision a6.04*
- Feb 13, 1996 revision a6.05 (minor revision)
- Feb 14 1996, revision a6.06
- Feb 15, 1996 Friday, revision a6.07/a6.10/a6.11
- Feb 21, 1996 revision a6.12
- Feb 23, 1996 revision a6.13 (added threads)
- Feb 28, 1996 revision a6.14
- March 5-7, 1996 revision a6.15 (co-op revising for core docs)
- April 1st, 1996 revision a6.16 (new computer, new printer... new doc with truespace renderings)
- July 26 1996 revision a6.20 (translation to HTML for 0.11 of website, and rewrite)
(a6.20)
The paper "Of Machine, Mind and Man" describes the fundamental concepts that are put into practice by this lowest level language, the assembler. It is highly suggested that you read both it and one of the Core Design documents before continuing to assimilate the information presented here.
Grail Millennium for the Intel processor is being written using Borland International's TASM 2.02* in IDEAL mode. For writing Grail programs, however, a somewhat purified assembler structure was created which may have people who are used to IDEAL mode wondering if they are suffering from dyslexia.
*Originally it was being written with the TASM 4.0 I had bought, but not everyone in the newly formed CORE group had it. So we compromised on 2.x. Relax.
It is different to be sure, but as I myself (Minimalist) will have to write most of the initial programs for Grail with nothing more than this assembler and a crude editor running under a hardware text CLI Metaphor, you damn well know I'm going to make sure it can do the job as quickly and efficiently as possible. GA (the assembler) is specifically tailored for writing applications in the Grail Operating System. You cannot assemble DOS, Linux, OS/2 or Microsoft Windows programs of any kind with it.
Further, it is highly recommended that any future high-level Grail languages such as C, C++, Pascal, OO LISP, or BASIC create GA Assembly Source Code and then summon GA to compile it to the standard Object Module Format, instead of trying to create object module files themselves.
Neutral (ie, Freeware) and Natural Versions
As you may be aware, there are two distinct versions of most Grail components, what are generally called the Neutral and Natural versions.
Neutral components are made in-house by the Grail Millennium .ORG for free or are donated by corporate or other entities. They are not allowed to have encryption, compression or other abilities that would not allow the components to be exported globally or internationally. In other words, they are legally safe and free for use anywhere.
The Natural (or black) versions of all components are not necessarily free, nor, while some of the members of the project may write them, do they come under the auspices of the Grail Millennium Project Organization. If you are wanting to create a competitive commercial assembler product, you should read the CORE docs to be aware of a few minor restrictions you will have to follow so as not to illegally misrepresent other companies and industries who are registered strategic partners or sponsors. (I.e., so that you don't muck up their unique legal verification certificates.)
Multiple Processor Architecture Support
As those who have read the CORE documents are aware, the Object Module Format allows code for multiple processor architectures to be compiled into a single object. Grail will run the appropriate processor code and ignore the rest. This allows you to write objects that will run natively on several different processors. No need to download a specific object for a specific processor. Just use it, and GRAIL will handle the rest.
At the moment those target processors are those listed below:
- The i386, i486, Pentium, Pentium Pro
  - which includes the IBM PC clones.
- The PowerPC 601, PowerPC 604 and PowerPC 620
  - which includes the Macintosh, the BeBox and the ESCOM Amiga.
- The Motorola 68020, 68040, 68060
  - which includes the older Macintoshes and the Commodore Amiga.
- The Advanced RISC Microprocessors ARM 610, ARM 710, ARM 810, and SA-110
  - which includes the modern British Acorn Computer line.
- And the NMD Virtual Processor Architecture.
--min
Table of Contents
- Legal Notices: Who gets sued and why
- History: Changes and who made them
- Preface: About GA and the people that make it
- Labels: The proper usage of labels
- Symbols
- Global Directives
- Line-by-line Directives
- Conditional Assembly
- Constants
- Sections: CODE and DATA
- Simple Data Structures
- Complex Data Structures
- Object Functions
- Abstracted Structures
- MULTI-PLATFORM
- HIGHER-LEVEL EXPRESSIONS
- ERRORS / WARNINGS
- DEBUG INFORMATION
- WRITING (USER) DEVICE OBJECTS
- WRITING FILESYSTEM DEVICE OBJECTS
- ASSEMBLER Object
LABELS
Variables and labels may include A-Z a-z 0-9 _ ? @ ! ~. And yes, GA is case sensitive to its labels and variables. Case is ignored for opcodes and directives.
Labels are text post-fixed by a colon, such as the following example.
This_is_a_label:
Labels are used in both code sections AND data sections. They indicate only a position in memory, and are associated with no specific size (ie, void casting).
In data sections you of course can also directly name a variable and this name is then associated with the declared size for that line of data.
Example 1:
dl Susy_Q 2
Susy_Q is associated with a long word (dword).
Example 2:
Susy_Q:
dl 2
Susy_Q is not associated with any size at all.
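The naming rule above can be sketched as a small check. This is Python, not GA, and only an illustration: the function names are hypothetical, no first-character rule is assumed, and the ' and " characters that a later section also allows in variable names are left out.

```python
import re

# Characters listed in the text as legal in labels and variables:
# A-Z a-z 0-9 _ ? @ ! ~   (assumption: nothing else is legal here)
_NAME_RE = re.compile(r"[A-Za-z0-9_?@!~]+")


def is_valid_name(name: str) -> bool:
    """True if every character of `name` is in the listed set."""
    return _NAME_RE.fullmatch(name) is not None


def split_label(line: str):
    """Return the label name if `line` is a bare label ("name:"), else None."""
    text = line.strip()
    if text.endswith(":") and is_valid_name(text[:-1]):
        return text[:-1]
    return None
```

Because names are case sensitive, `Susy_Q` and `susy_q` would be two different labels under this check.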
SYMBOLS
In general Grail only reserves the characters NULL (0), and NEWLINE (13) for itself concerning text operations. In the assembler, the symbols < > " @ and , are also used.
<> is used to represent the ascii/unicode of a byte, word or long word (dword) value.
'' is used for bytes or literal strings.
@ is used to derive the current data offset (similar to TASM's '$')
The symbol "%%" prefixes a variable that is considered to be generated by the assembler.
The symbol "#" can be used to expressly declare that the following item is a number. For some assembly language syntaxes this is mandatory. For others it is either optional or unused.
GLOBAL DIRECTIVES
Perhaps I used too much COBOL in college, but this is how the global directives are handled. For some of the directives please refer to the CORE documents for the current definitions. (The CORE documents are MPU.DOC, FS.DOC, VFS.DOC and BOOT.DOC.)
GRAIL/HEADER/OBJECT
It is required to enclose certain parts of your source document with the following keywords. Usually they use the parameter BEGIN to begin the enclosure and END to denote its ending. If the enclosing directive does not have arguments it is assumed to be BEGIN. The keywords are:
GRAIL
HEADER
OBJECT
and PROCESSOR
GRAIL encloses the entire document.
HEADER encloses the single document header containing the global directives that follow.
OBJECT encloses the data and code of this source document.
PROCESSOR encloses an area that is typically one or more CODE BEGIN/CODE END sections. This directive requires as an argument the CPU that the code is written for. For example:
PROCESSOR i386
...allows only Intel 386 or earlier instructions to be used, and tells the assembler how data is to be aligned with the ALIGN arguments. In this case, data is set for 16 bytes paragraph alignment. This defines the PTR/DPTR bitwidths.
If you use floating point operations in your code you must use the keyword FPU directly before you state the processor type. If you follow the PROCESSOR directive directly with the keyword LOCK then you instruct the object flags to be set locking the object into the bitwidth of that processor.
The current valid processor types are:
- i386NOFPU
- i386
- i486NOFPU
- i486
- iPentium
- iPentiumPro
- PowerPC601
- PowerPC602
- PowerPC603
- PowerPC604
- PowerPC620
- MC68020
- MC68030
- MC68040
- MC68060
- ARM610
- ARM710
- SA110
- vNMD
For example:
PROCESSOR LOCK FPU i386
PROCESSOR END
NAMES/NAMES END
TITLE title
The title of the file. I.e., its filename. Also the object class name if an object (which of course, since this is an assembler, means we're always talking about object class names here. :)
SUBTITLE subtitle
The subtitle of the file or the instance name if an object. Instance names of objects are usually NULL except for device objects.
NAME title:subtitle
Instead of using TITLE/SUBTITLE, you can alternately just use NAME to reference both at once. The dividing symbol is a period for objects and a colon for all other files. Note that you CAN use a colon for objects as well, but it isn't recommended.
AUTHOR
Your name or the names of the teams that produced the object, are stored in the object module and echoed to the context in the application section. You can specify a plaintext filename here as well using the full URL scheme.
ORGANIZATION
Your company and/or group name. Stored in the object module and echoed to the context in the application section. You can specify a plaintext filename here as well using the full URL scheme.
LEGAL
Indecipherable pseudo-latin jargon goes here. This string is stored in the object module and echoed to the associative branch in the application section. You can specify a plaintext filename here as well using the full URL scheme.
INFO/INFO END
LANGUAGE
Forces text data to be 8 or 16 bits.
ISO 8859-1 (or ISO LATIN-1)
ISO (or UNICODE)
TIMEDATE timedate_stream
This stream is used to generate the conception date which is stored in the object structure and echoed to the associative branch.
CERTIFICATE_ID
This is a 32-bit number. Should be 0 if using the freeware version of the assembler, or 1 if third-party.
CERTIFICATE_VERIFICATION
A stream of data. Usually a file. Sometimes a string in the source itself.
BUILD.OBJECT (or BUILD)
The revision number, or internal build, for the object itself.
BUILD.STRUCTURE (or BUILD)
Revision number of public structures.
USAGE
See the CORE documents for further information under OBJECT MODULE FORMAT.
- Public Domain
- Freeware
- Shareware
- Commercial
- Site Commercial
- Restricted
- Government
QUALITY
See the CORE documents for further information under OBJECT MODULE FORMAT.
Quality
- In-house Alpha
- Alpha
- Beta
- Wide Beta
- Gamma
- Final Gamma
ORIGIN
See the CORE documents for further information under OBJECT MODULE FORMAT.
- OEM
- Vendor
- Grail Team
- Grail Associates
- Hobbyist
- Professional
OBJECT/OBJECT END
This section defines hard attributes of the object which will have profound impact on how it operates.
TYPE
What kind of object is this? There are essentially six types of object.
Active
- Supervisor
- Device
- User
Passive
- Supervisor
- Device
- User
STACK numeric
The requested stack space, in bytes (aligned), for this object. If STACK is not specified then no stack space is given. The stack is treated much like the uninitialized data area (very much like it). It is allocated at run time and is valid only for that particular object.
You can use a postfix of k or m to signify kilo-bytes and megabytes, such as
STACK 8k
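As a rough illustration of the k and m postfixes, here is a Python sketch (not GA). It assumes binary units (1k = 1024 bytes, 1m = 1024k), which the text does not state, and the function name is made up for the example.

```python
def parse_stack_size(text: str) -> int:
    """Convert a STACK argument such as "8k" or "2m" into a byte count."""
    # Assumed multipliers: the document only says kilo-bytes and megabytes.
    multipliers = {"k": 1024, "m": 1024 * 1024}
    text = text.strip().lower()
    if text and text[-1] in multipliers:
        return int(text[:-1]) * multipliers[text[-1]]
    return int(text)  # plain byte count, no postfix
```

Under that assumption, STACK 8k would request 8192 bytes before any alignment is applied.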
INSTANTIATION
Can your object have multiple instances of itself? For most objects this is yes, but for device objects and the core set this is a qualified no. By that, it is meant that there are two kinds of no to be dealt with. An explicitly instantiated object can have multiple instances only if those instances are named within the object. In other words, you can have the instances of __FSDEVICE SCSI and EIDE in memory at the same time because they are explicitly instantiated. As for the core objects, there can only be multiple instances of them so long as the instances are for disparate processors.
So, in summary, we can have:
No, core
No, explicit
Yes
SECURITY
There are four levels of security:
Supervisor
Secure (Private User)
Public (Public User)
Hermetic (Hermetic User)
TASK
What kind of multi-tasking, if any, are we interested in for this object? This is only valid if the object is an active instance (ie, contains a main function). This is ended by the directive TASK END.
CLASS
- 04 Event
- 03 Invocation
- 02 Timer
- 01 Adaptive
- 00 Fairturn
PRIORITY
- 00 SCHEDULER
- 01 CoreCritical
- 02 Core
- 03 FileSystemHigh
- 04 FileSystemLow
- 05 RealtimeDeviceHigh
- 06 RealtimeDeviceLow
- 07 ObjectHigh
- 08 ObjectLow
- 09 DeviceHigh
- 10 DeviceLow
- 11 ThreadHigh
- 12 ThreadModest
- 13 ThreadLow
- 14 Idle
- 15 Sleep
OBJECT
This is the exact same format as TASK. It explicitly targets the object itself.
THREAD
This is the exact same format as TASK. It explicitly defines the default for all threads of the object.
LIST/LIST END
In this section you can declare a list of objects to automatically request and discard by scope. More convenient than doing the request/discards yourself. It assumes that a return of greater than 0 is success and less than 0 is failure. For example:
LIST
"TIFF"
"GIF"
"JPEG"
LIST END
FILES/FILES END
All the following directives are in the enclosed FILES/FILES END structure.
LINK
Would you like for a command interface link to be automagically created for you? If so, specify the folder path here. The link is placed in USER://Command Interface/, etc.
FOLDER
By default, the compiled object is placed in the same folder as the source code. This allows you to specify either an alternate absolute folder, a relative folder or a virtual folder (ie, temp_folder, etc).
ASSEMBLY_LANGUAGE
This directs the generated phase1 assembly language source code out to a specific file. This source is created after all macros, parsemacros, and other high level constructs have been evaluated and converted into the lowest common denominator of assembly op-codes.
MACHINE_LANGUAGE
Not the most useful of things, but this directs machine level (ie, DEBUG.COM quality) source to a file which can be printed out and generally be used to paper several walls very nicely even with small programs.
DEBUGGING
If specified then debugging information will be generated and stored in the object file and sent to the report file if it is used. By default, the single keyword enables this directive but for clarity you can also use the keywords IGNORE and USE. Normally the debugging info will include the extrapolated text assembly code (only). By using the keyword DETAILED all the text source will be packed into the debugging info, including remarks and blank lines.
ERRORS
If specified then errors will be generated and stored in the object file and sent to the report file if it is used. By default, the single keyword enables this directive but for clarity you can also use the keywords IGNORE and USE.
WARNINGS
If specified then warnings will be generated and stored in the object file and sent to the report file if it is used. By default, the single keyword enables this directive but for clarity you can also use the keywords IGNORE and USE.
REPORT report_output_file
Names the file to output the text report of the assembly to, including all errors and warnings. If not specified then no report is generated.
LINE-BY-LINE DIRECTIVES
Unlike the global directives, the following directives modify how an assembly proceeds on a line-by-line basis.
%INCLUDE "name"
Includes SOURCE/OBJECT/ASSEMBLY. Don't forget that Grail does not use dot extensions such as ".ASM" or ".inc". At least it doesn't have to. Generally assembly source is DATA type, NULL subtype, and in the system associatives it is defined as "TEXT/ASCII/SOURCE/ASSEMBLY".
%BINARY "name"
Binary include. Includes the file as a raw block of data (DATA/DATA END). For Example:
Future_Crew_Logo:
%BINARY FCLOGO.GIF
%% ... %
Comments. Use ; or %% ... % for comments.
%JUMP
Concerns CODE sections. Uses short jumps/branches (8-bit short Jcc on Intel, Bcc.S on Motorola) where possible instead of long ones (near 32-bit Jcc on Intel, Bcc.L on Motorola).
%NOJUMP
Always use the long jump/branches (32-bit Jcc on Intel Processors)
%OPTIMAL
Substitutes instruction mixes that are functionally the same, but faster. Or at least tries. :) Careful with this. It can thrash your painfully ordered code. (Not that this matters anymore with the P6 designs.) Note that this directive is ignored in the free version of the assembler. :(
%NONOPTIMAL
The default.
%RADIX 2,10,16 (b,d,h)
Tells the assembler whether by default you use hexadecimal, decimal or binary.
%ASCII
Forces all text data following to be 8-bits.
%UNICODE
Forces all text data following to be 16-bits.
%NOASCII or %NOUNICODE
Forces all text data to be the default as defined in the header section.
%PTR n
Overrides the default pointer size used. Using it without an argument changes to the default.
%FSPTR n
Overrides the default pointer size used for handling filespaces. Using it without an argument changes to the default.
%ASSC class_struct, var1, var2, ..., varn
As part of a6.07, GA introduced several mechanisms to allow more efficient MACROs to be written. The %ASSC directive is mostly intended to allow a variable reference to be associated with a specific class structure during assembly.
%NOASSC var1, var2, ..., varn
This removes the effects of %ASSC.
%ALIAS alias=variable
This serves two purposes. The first is the most straightforward. You can make an alias for a variable such as
%ALIAS bob=x11
You have not actually allocated any memory for bob, but any time a variable bob is encountered from now on, it will refer to x11. The other use of the %ALIAS directive is to allow you to give a register a specific quality such as being a 16.16 fixed point variable. It would thus be treated as such by all the higher level constructs. For instance:
%ALIAS xn a_fixed_point=EAX
%NOALIAS alias
Removes an alias.
CONDITIONAL ASSEMBLY
%EQU name value
Equate a pre-assembly constant to a specific value. Example:
%EQU TRUE 1
%ENUM s1,s2,...
An incrementally serial %EQU function: each name after the first receives the previous value plus one.
%ENUM NORTH=0,EAST,SOUTH,WEST
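To make the "incrementally serial" behaviour concrete, here is a Python sketch (not GA) of how such a list could expand into individual equates. It assumes counting starts at 0 when no value is given and that an explicit `name=value` resets the counter; neither detail is spelled out in the text, and the function name is hypothetical.

```python
def expand_enum(arguments: str) -> dict:
    """Expand "NORTH=0,EAST,SOUTH,WEST" into a name -> value mapping."""
    values = {}
    next_value = 0  # assumed starting value when none is given
    for item in arguments.split(","):
        name, _, explicit = item.strip().partition("=")
        if explicit:
            next_value = int(explicit)  # assumed: explicit value resets the counter
        values[name.strip()] = next_value
        next_value += 1
    return values
```

With the compass example this gives NORTH 0, EAST 1, SOUTH 2 and WEST 3, the same as writing four separate %EQU lines.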
%NOEQU name
Unequates a variable.
%IF/%IFDEF equ [value1, value2, etc]
;code
%ENDIF
This, and %IFNDEF, are used to conditionally include source code for assembly. Simply put, this allows you to change one %EQU statement and cause several sections of code to be commented out or included. Mostly used for debugging or using the same object source to compile different sections of code depending on processor types.
%IFN/%IFNDEF equ [value1, value2, etc]
;code
%ENDIF
Assemble if not.
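The effect of %IFDEF/%IFNDEF can be pictured as a filter over source lines. The Python sketch below (not GA) handles only the defined/not-defined test with nesting; the optional value list is ignored, and the sample source lines in the usage note are placeholders rather than real GA code.

```python
def preprocess(lines):
    """Return the source lines that survive conditional assembly."""
    equates = {}
    active = [True]  # stack: is the current block being assembled?
    output = []
    for line in lines:
        words = line.split()
        keyword = words[0].upper() if words else ""
        if keyword in ("%IFDEF", "%IFNDEF"):
            defined = words[1] in equates
            wanted = defined if keyword == "%IFDEF" else not defined
            active.append(active[-1] and wanted)
        elif keyword == "%ENDIF":
            active.pop()
        elif not active[-1]:
            continue  # inside a block that is being skipped
        elif keyword == "%EQU":
            equates[words[1]] = words[2]
        elif keyword == "%NOEQU":
            equates.pop(words[1], None)
        else:
            output.append(line)
    return output
```

Feeding it `%EQU DEBUG 1` followed by an `%IFDEF DEBUG` block keeps that block and drops any `%IFNDEF DEBUG` block, so changing the single %EQU line flips which sections are assembled.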
CONSTANTS
%BITWIDTH name
A constant that gives the bit width of an element of a dbit field. Example:
mov eax, %bitwidth player_status.facing
%SIZEOF name
Returns the size in bytes of a STRUCT or of all the variables on the same line as a name. For example, this bit of assembly would return 3:
data
bignosedcat db 1,13,95
data end
code
mov eax, %sizeof bignosedcat
code end
You can also use sizeof to get the size of data types, such as: mov eax, %sizeof dn.
AT-ASSEMBLY INFORMATION
Also available to an object are a set of values computed at assembly time. The following integer values are available to be substituted as constants wherever encountered.
%LinesObject
%LinesFunction
%Line
%LineofFunction
%BytesObject
%BytesFunction
%Byte
%ByteofFunction
CUSTOM CONSTANTS
For the most part, the assembler has a complete list of all the basic required equated constants. If you look in its associative branch, application area under "DEFAULT CONSTANTS", you can override some of its assumed constants however. It also keeps a "DEFAULT CONSTANTS" in the user area.
For example you might add this to its area for your user account:
[DEFAULT CONSTANTS]
%EQU NEWLINE 13
%EQU NULL 0
%NOEQU TEST
CODE / DATA SECTION
Grail uses a robust structure called the Object Module Format (see the CORE document) to store all of the information generated at assembly time. Among other things, it allows multiple sections of code or data to be stored therein, each with its own particular attributes. At load time, all the code and data is recomposed, leaving out all extraneous information.
Code generally is only stored in separate sections when code for different processor architectures is involved.
Data, grouped separately from code, may be divided into several types of sections, including, but not limited to, initialized, uninitialized, and endian-sensitive data sections. Data in each structure, if it contains even just one initialized variable, is included in the initialized data block. Uninitialized data in a STRUCT is actually filled with zeros currently, but do NOT count on this always being true.
Uninitialized data that is not in a STRUCT is squeezed out of the rest of the data block and placed at the end of the data block in the uninitialized block. The size of this area is computed and saved in the header as the variable UNINITIALIZED_LENGTH. When the application is loaded this extra amount of space is allocated with the rest of the data block.
There is one very important thing to remember. As anyone who has also read the CORE document knows, Grail will attempt to convert between bitwidths of objects in the same instruction architecture (ie, from a 32-bit PPC601 to a 64-bit PPC620). Unless you specifically declare an object module to be restricted to a specific bitwidth, all pointer and pointer derived data statements as well as JMP, CALL, etc code instructions may be promoted/demoted to the bitwidth used within a specific family of processor architecture.
DATA SECTION
Where the data is.
DATA
NONALIGN
ALIGN
STATIC
READONLY
ENDIAN n
FLUSH
SHARE
DATA END
The default is NONALIGN. A data section is terminated by the keyword, DATA END. It uses NULLs for padding.
STATIC
If you postfix a STATIC after DATA then you can be assured that your data will stay WHERE YOU PUT IT RELATIVE TO EVERYTHING ELSE. If not, then uninitialized data will be moved away automatically on a line-by-line basis. In other words, GA evaluates each line in DATA/DATA END. If all data on that line is uninitialized it is marked as reallocatable unless the section is declared STATIC and unmoving.
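The line-by-line rule can be sketched in Python (not GA). Each data line is modelled as a directive plus its values, with "?" marking an uninitialized value. The byte sizes in `SIZES` assume a 32-bit Intel target with DL as a dword, and all names are hypothetical.

```python
SIZES = {"db": 1, "dw": 2, "dl": 4}  # assumed byte sizes on a 32-bit Intel target


def split_data(lines, static=False):
    """Split data lines into initialized and movable uninitialized groups.

    lines: list of (directive, [values]); "?" marks an uninitialized value.
    Returns (initialized, uninitialized, uninitialized_length_in_bytes).
    """
    initialized, uninitialized = [], []
    for directive, values in lines:
        # A line moves only if every value on it is uninitialized
        # and the section was not declared STATIC.
        movable = not static and all(v == "?" for v in values)
        (uninitialized if movable else initialized).append((directive, values))
    length = sum(SIZES[d] * len(v) for d, v in uninitialized)
    return initialized, uninitialized, length
```

A line such as `dw ?, 7` stays in the initialized block because one of its values is set, while `dl ?` is moved to the end and counted toward the uninitialized length.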
ALIGN
The ALIGN subdirective does two things. It forces the beginning of the data section to be aligned to what is typically the width of memory moves with the processor's bus system. That is, for a 486 or Pentium, about 16 bytes. It also aligns each individual element so that misalignments do not happen. That is:
db ?
dl ?
could possibly cause a slow-down because the dword is not on a dword boundary. ALIGN will fill in three bytes before the dword.
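The three bytes of fill can be worked out mechanically. This Python sketch (not GA) assumes each element is aligned to a multiple of its own size, which matches the db/dl example but is otherwise an assumption about what ALIGN does.

```python
def layout(elements, align=True):
    """Place elements in order and return ([(name, offset)], total_size).

    elements: list of (name, size_in_bytes).
    """
    offset = 0
    placed = []
    for name, size in elements:
        if align:
            # Bytes needed to reach the next multiple of the element size.
            offset += (-offset) % size
        placed.append((name, offset))
        offset += size
    return placed, offset
```

For a byte followed by a dword, the aligned layout puts the dword at offset 4 (three fill bytes, 8 bytes total), while the NONALIGN layout puts it at offset 1 (5 bytes total).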
NONALIGN
Using the NONALIGN subdirective forces a section to byte alignment.
READONLY
If you declare a data section READONLY then it is assumed you will not try to change any of the information therein. The benefit you receive from having a read-only data section is that it is reusable by multiple instances of an object. All data that is not READONLY must have a separate copy of such for each instance in existence. If you have a lot of data this can quickly take up a lot of memory. You may use the alignment subdirectives in combination with this.
ENDIAN
By default on the Intel x86 processors the byte order of data is assumed to run logically from 0 to n, just as memory is ordered. You may explicitly declare this with ENDIAN LITTLE. You may also cause data to be written in reverse order with ENDIAN BIG. Note that when you use the PROCESSOR directive the default for the processor is selected, ie LITTLE ENDIAN for the Intel x86s.
You may also use the processor family name (if you have the commercial version of the assembler) and it will handle this for you. For example: ENDIAN PowerPC.
Any data section that declares what its endianness should be is clumped together in the object module. If you have written an object with code for multiple processor architectures, then the object module load routines will automatically change endianness of any such specified data areas. This allows you to generate common data areas that may be used by various processors without redundancy. To use the PROCESSOR default use the keyword ENDIAN without further arguments. Please note however that you should keep these endian sensitive areas to a minimum if possible.
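What a load-time endianness change amounts to can be shown with Python's standard `struct` module. This is only an illustration of byte order, not of the actual Object Module Format loader; the function names and the fixed element width are assumptions.

```python
import struct


def encode_word16(value: int, endian: str) -> bytes:
    """Encode a 16-bit value as ENDIAN LITTLE or ENDIAN BIG would store it."""
    fmt = {"LITTLE": "<H", "BIG": ">H"}[endian]
    return struct.pack(fmt, value)


def swap_endianness(data: bytes, width: int) -> bytes:
    """Reverse the byte order of each width-byte element in data."""
    return b"".join(data[i:i + width][::-1] for i in range(0, len(data), width))
```

The swap only works if the loader knows the width of every element in the section, which is one practical reason to keep endian-sensitive areas small.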
PERSISTANT
The subdirective PERSISTANT cannot be used with READONLY. It forces the section of data to be written back out to the object (in filespace) when the object is dereferenced. You can thus store default values in this area and update them as the object is terminated. You might for example set up a DIGITALAUDIO device object such as:
DATA PERSISTANT
d.16 Port 220
d.8 DMA 1
d.8 HDMA 5
d.8 IRQ 5
DATA END
Notice that once persistent data is changed, the original states are not retrievable (without referencing the prototype object, if any). You can give persistent data a default state which can be called up:
DATA PERSISTANT DEFAULT
d.16 Port 220
d.8 DMA 1
d.8 HDMA 5
d.8 IRQ 5
DATA END
...and change any of these variables, then use an __OBJECT.Flush to have the changed data written back to the object file.
SHARE
Share simply means that if any ascii or unicode data lines occur in another sharing data structure that they should combine and share the data to save space.
PRIVATE
Under normal circumstances data and code in Grail may either be expressly public, expressly private or...in a state of addressable obscurity that's come to be known as the neither-state. If you have not expressly declared data to be public or private, it is probably available, but its current address is not readily discernible. If you declare a section of data to be PRIVATE however, an attempt will be made to make it completely unaddressable to all other objects as long as the object is a Supervisor or Secure User object. PRIVATE is ignored if the object is Public User or Hermetic.
PUBLIC
If a section of data is declared PUBLIC, then all references therein are placed in the public data section of the Object Module Format and the data is assumed to be physically available.
CODE SECTION
Where all code goes, silly.
Unlike the DATA SECTION which pads space with NULLs, the CODE section uses NOP op-codes.
You cannot use data statements within code sections as you might in other assemblers. All declarations of data such as
ptr test_ptr
will push a pointer variable onto the object's/thread's stack. For the most part, all such declarations are made at the beginning of the function. It is not recommended, but you can use the keyword un. within a code section to pull data from the stack such as
un.ptr
which would compile to
add esp, 4
The code
un.ptr EAX
would compile to
pop EAX
CODE
PRIVATE
PUBLIC
;code
CODE END
PUBLIC
Public overrides all function directives to the contrary.
PRIVATE
Private overrides all function directives to the contrary.
SIMPLE DATA STRUCTURES
The available data structures for the assembler are, to say the least, rather robust. Don't let this scare you off. Uninitialized data is represented by a "?" where the value normally would be found. Alternatively you can just terminate the command with a newline. For example:
DB ?
and...
DB
are equivalent.
It should be noted that the assembler handles ' and " symbols more intelligently than some assemblers for other platforms. The ' and " symbols are allowed in variable names so that the variables x' and x" are possible.
PTR
Same as DL, a 32-bit near pointer, under Intel x86. A 64-bit pointer under PowerPC 620, etc.
DPTR
This reserves a number of bytes equal in size to the machine natural pointer size, which on Intel 386 to P6 is 32 bits (dword). Usually used to hold length or size data in association with pointers.
FSPTR
The same as a PTR, but for filespace. It's best if PTR and FSPTR are the same.
FSDPTR
The same as DPTR, but for filespace.
ASCII
Equivalent to D.8, but forces any text to 8-bits.
ascii,NULL
UNICODE
Equivalent to D.16, but forces any text to 16-bits. For example:
unicode,NULL
Note that all things on this line are extended to 16-bits, including the NULL.
TEXT
This is either ASCII or UNICODE depending on how the LANGUAGE section is set up in the header.
text,NULL
D.bitwidth [name] value
Allows you to specify the exact bitwidth of the data element you are using. If it is not supported, you receive the next highest one that is. It is allowed to use multiples of smaller elements to build to a data size. Take these for example on an Intel machine:
D.24 ;gives three bytes (truecolor)
D.32 ;gives a four-byte dword
D.4 ;rounded up to a byte
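The three cases above follow from rounding the requested bit count up to whole bytes. A one-line Python sketch (not GA) shows the arithmetic, assuming the byte is the smallest element the target supports, as on Intel.

```python
def bytes_for_bitwidth(bits: int) -> int:
    """Smallest whole number of 8-bit bytes that can hold `bits` bits."""
    return (bits + 7) // 8
```

So D.24 takes three bytes, D.32 takes four, and D.4 is rounded up to one.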
DS [name] value
The shortest data type available on current processor. Usually an 8-bit byte.
DL [name] value
The largest bitwidth item supported by the processor architecture. 64-bits on an x86.
DN [name] value
The largest natural bitwidth item supported by the processor. A 32-bit dword on Intel x86 processors.
F.exponent.significand [name] value
Floating point.
FS [name] value
FL [name] value
FN [name] value
X.bitwidth [name] value
Fixed-point.
XS [name] value
XL [name] value
XN [name] value
DB [name] value
Data byte.
DW [name] value
Data word.
DL [name] value
Data double word or long word.
DP [name] value
Data paragraph. 16 bytes on Intel.
DPG [name] value
Data page. 4096 bytes on Intel.
Dt.varequ
You can have the size of a data area determined by constants, equates (%EQU) and memory lookups. The memory lookups may give erratic, possibly fatal, results if the memory area is declared uninitialized. You can specify the type, by the way. By default, without a type, it is assumed you are talking about bits (ie, "D"). You can expressly use bytes ("DB") or words ("DW"), etc.
For instance:
%EQU XLENGTH 320
%EQU YLENGTH 200
D.XLENGTH*YLENGTH*8 buffer
would allocate 64000 bytes.
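The 64000 figure comes from substituting the equates into a bit count and converting to bytes: 320 * 200 * 8 = 512000 bits. The Python sketch below (not GA) does that substitution for expressions made only of names, integers and `*`; the function name and the restriction to multiplication are assumptions made for the example.

```python
def data_size_bytes(expression: str, equates: dict) -> int:
    """Evaluate the bit count after "D." and return the size in bytes."""
    bits = 1
    for factor in expression.split("*"):
        factor = factor.strip()
        # Substitute an equate if one exists, otherwise read a plain integer.
        bits *= equates[factor] if factor in equates else int(factor)
    return (bits + 7) // 8  # round up to whole bytes
```

Called with `"XLENGTH*YLENGTH*8"` and the two equates above, it returns 64000.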
COMPLEX DATA STRUCTURES
CLASS STRUCTURE
CLASS STRUCT name
list...
STRUCT END (or ENDS)
Unlike the STRUCT/ENDS construct, this creates a virtual class or named template. To create an actual physical structure such as this you either use the CREATE/DESTROY data instantiation pseudo-instructions (to create/destroy it at run-time) or use its name prefixing a new variable name (to create at assembly time). Both of these methods assign the internal struct to the new variables so that you may refer to the internals using indirection, etc such as:
CLASS STRUCT x
db bitwidth
d.32 ID
db.512 sector
STRUCT END
x newx1
x newx2
....
newx1.bitwidth=32
newx1.ID=7
;do other stuff
STRUCTURE
STRUCT name
list...
STRUCT END (or ENDS)
You can use this to create complex multi-data structures.
UNION
The union directive allows you to use several data structures that use the same physical space. Union areas always remain together and stashed in the initialized area. You can create the union of several different single data types, or use the ELEMENT/ELEMENT END subdirectives to create more complex multi-data union structures.
UNION name
ELEMENT name
list...
ELEMENT END
UNION END
An example of a simple union would be:
UNION
db a
dl b
UNION END
An example of a complex union would be:
UNION pixel
ELEMENT indexed
d.8 index
ELEMENT END
ELEMENT rgb
d.8 red
d.8 green
d.8 blue
d.8
ELEMENT END
ELEMENT rgba
d.8 red
d.8 green
d.8 blue
d.8 alpha
ELEMENT END
UNION END
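For readers who want to see the overlap in action, the pixel union can be mimicked with Python's standard ctypes module. This is a rough analogue only: the class names and the field name pad (standing in for the unnamed d.8 filler) are invented here, and nothing is implied about how GA itself lays the union out.

```python
import ctypes


class Indexed(ctypes.Structure):
    _fields_ = [("index", ctypes.c_uint8)]


class RGB(ctypes.Structure):
    _fields_ = [("red", ctypes.c_uint8), ("green", ctypes.c_uint8),
                ("blue", ctypes.c_uint8), ("pad", ctypes.c_uint8)]  # pad = unnamed d.8


class RGBA(ctypes.Structure):
    _fields_ = [("red", ctypes.c_uint8), ("green", ctypes.c_uint8),
                ("blue", ctypes.c_uint8), ("alpha", ctypes.c_uint8)]


class Pixel(ctypes.Union):
    # All three elements share the same four bytes of storage.
    _fields_ = [("indexed", Indexed), ("rgb", RGB), ("rgba", RGBA)]


p = Pixel()
p.rgba.red = 255
p.rgba.alpha = 128
print(ctypes.sizeof(Pixel), p.rgb.red, p.indexed.index, p.rgb.pad)
```

Writing through one element and reading through another shows that the elements really do occupy the same physical space.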
DBIT
DBIT.alignment name
bits name default
DBIT END (or ENDD)
Specifies a variable bit field which can be allocated in blocks of 8 bits. Defaults to 1 byte. The alignment variable directly specifies how many bytes should be used. Each element is made of three parts: the bits given to the element, the element name and its default value. For example:
dbit.4 player_status
1 alive?
2 facing ;0=north, east, south, west
1 god_mode?
dbit end
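The packing of player_status might look like the following Python sketch. The bit order is an assumption (the first element is placed in the lowest bits), since this document does not state it, and pack_bits / unpack_bits are illustrative helpers rather than anything GA provides.

```python
# (name, width in bits), in declaration order. Assumption: first field = lowest bits.
PLAYER_STATUS = [("alive", 1), ("facing", 2), ("god_mode", 1)]


def pack_bits(layout, values):
    """Pack named field values into one integer according to the layout."""
    word, shift = 0, 0
    for name, width in layout:
        value = values.get(name, 0)
        if value >> width:
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        word |= value << shift
        shift += width
    return word


def unpack_bits(layout, word):
    """Split a packed integer back into its named fields."""
    fields, shift = {}, 0
    for name, width in layout:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


status = pack_bits(PLAYER_STATUS, {"alive": 1, "facing": 2, "god_mode": 0})
print(status, unpack_bits(PLAYER_STATUS, status))
```

The three elements use only 4 of the 32 bits that dbit.4 reserves; the remainder is simply left unused in this model.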
ARRAYS
An array is a run of elements, each offset from the previous one by the size of a class structure or a simple data type. For example:
db[16]
or
db.32[16]
or
track[64]
or
IRQdescriptor[16].base=13
OBJECT FUNCTIONS
All objects contain functions. Three functions are special: Creation, Destruction and Main. All objects are required to have the Creation and Destruction functions. If these functions are not found in the source of an object then GA automatically inserts NULL versions of them. The function Creation is run after a __OBJECT.Request. The function Destruction is run before the __OBJECT.Discard. If an object has a function called Main then the __OBJECT.Initiate function can be used to make it a separate running task.
All functions in your object are called via the CALL op-code and their name. If the function is in another object you must first get a pointer to that function by calling the __OBJECT.Function function, such as:
__OBJECT.Function MY_OBJECT, function1
This would return in ESI a pointer directly to the function called function1 with which you could do a:
CALL ESI
Note that, if you didn't already know, all object functions prefixed with two underlines and written in capitals are core functions. Refer to the CORE documents for more info. All core functions are actually protected behind CALLGATES and thus you do not need to use __OBJECT.Function to get their pointers; they are inherent in the CORE structure.
FUNCTION SCOPE
GA automatically generates the code to make objects and data go out of scope when the main function comes to its end. See for example the full scale demo FIRE.ASM.
AUTOMATIC SCOPED VARIABLES
You can temporarily create variable structures in your stack by declaring them within a function, before any code.
func test
d.32 n 100
while [n]
dec.32 [n]
endw
end
This is converted to:
mov ebp, esp
push.32 100
while00001:
cmp [ebp], 0
beq.s while00001e
dec.32 [ebp]
jmp while00001
while00001e:
add esp,4
FUNCTION
Functions begin with the keyword FUNC and end with the keyword END. The keyword and the meaning of its modifiers follow.
FUNC name parms...
(arg1, ...)
PUBLIC
USER=n
SUPERVISOR
USES
ALIAS
PARAMETERS
DESCRIPTION
;code
FUNCTION END (or END)
(arg1,...)
Specifies the arguments that are to be used and their sizes. By default the stack is used for arguments, but each arg can be defined to use a specific register for variable passing. For example, instead of:
FUNC PLOT (X Y)
;code
END
you could have:
FUNC PLOT (X=EAX Y=EDX)
;code
END
Any argument that is not associated with a register is pushed/popped and referenced via EBP and an offset. It is assumed that using parentheses after the function name indicates the arguments.
PUBLIC
If you wish the function to be public and show up in the public functions list, then you should use the keyword PUBLIC before the keyword FUNCTION.
USER=n
If a function requires supervisor level privilege and is part of the core set, then you should use this keyword. Among other things, this causes the data needed to build a CALLGATE descriptor to be emitted to the Object Module Header. When such a core object is loaded, it creates the CALLGATEs that are to be added to the LDT on the X86 processors.
The exact number of the core function is specified by assignment and may be a number from 1 to 1000. This is defined by The Grail Project exclusively. In some sense this is like declaring the function to be public.
SUPERVISOR
This overrides the PUBLIC keyword, and places the function name in the PUBLIC SUPERVISOR List. Only objects running in supervisor mode can access this function.
USES regs,...
Automatically generates the appropriate pushes and pulls for the listed registers.
ALIAS function
This tells the assembler that this function is an alias for another sibling function. Use the ALIAS directive if two functions are identical in their code so as to save space and effort.
PARAMETERS/END
The Object Module Format allows you to define included text that describes the values of each parameter for each function. While this can be used by a higher-level language, currently the only language that makes use of this is Grail Script. It also has some use as a debugging tool.
And as ever, here of course is an example:
FUNC Color_the_Screen color, border
PARAMETERS
ELEMENT color
Black=0
Red
Blue
White
ELEMENT END
ELEMENT border
None=0
Thin
Wide
Ornate=7
ELEMENT END
PARAMETERS END
;code here
FUNC END
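The example leaves most element values implicit. One plausible reading, and it is only an assumption borrowed from C-style enumerations, is that an entry without a value takes the previous value plus one. The Python sketch below numbers the entries on that basis; number_elements is a hypothetical helper.

```python
def number_elements(entries):
    """Assign values to PARAMETERS entries.

    Assumption (not stated in this document): an entry without '=' takes
    the previous entry's value plus one, starting from 0.
    """
    values, next_value = {}, 0
    for entry in entries:
        name, sep, explicit = entry.partition("=")
        if sep:
            next_value = int(explicit)
        values[name.strip()] = next_value
        next_value += 1
    return values


print(number_elements(["Black=0", "Red", "Blue", "White"]))
print(number_elements(["None=0", "Thin", "Wide", "Ornate=7"]))
```

If GA numbers the entries differently, only the counting rule in the loop would need to change.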
DESCRIPTION langcode, string
You can provide a NULL-terminated string, if you wish, which describes what the current function does. The string will be embedded into the object module format and can be used by any programmer that uses your objects. By convention the description of the Main function is considered the overview of the entire object's purpose.
You can also supply a file name that will include the text.
You can use multiple descriptions, one for each langcode. This would for instance allow you to include descriptions of a function in English, Finnish, Russian, German, etc. at the same time. This is extremely useful when debugging or sharing code between large teams.
THREAD
All threads begin with the keyword THREAD and end with the keyword END. In most aspects a thread is the same as a function, and uses all of the same modifiers, except in its use of arguments. A thread is always passed just one argument, which may be ignored, interpreted as a pointer or as a pointer-width variable. The variable is always passed by register, on the x86 that register being EAX. The name of the argument is typically "unique". For example:
THREAD Renderline (unique)
;code goes here.
THREAD END
STACK
The one major difference between a THREAD and a FUNC is that THREADs allocate their own stack space. This operates exactly like the OBJECT stack declaration. By default, if no stack is defined you are currently given 4k. This may change at any time.
ABSTRACTED STRUCTURES
IF/ELSE
IF expression
;code...
IF ELSE (or ELSE)
;code...
IF END (or ENDIF)
This is the standard if/else conditional construct. It allows processor-neutral conditionals to be expressed and contained.
WHILE
WHILE expression
PREEVALUATE/PREEVALUATE END
;code...
WHILE END (or ENDW)
This is the internal while/endwhile macro. It can do some fairly high-level evaluations through the use of the PREEVALUATE keyword.
This is an example of the simple WHILE construction.
ECX=0
WHILE ECX<2000
ECX++
WHILE END
We initialize ECX to 0 and count to 2000. The expression to evaluate is directly after the WHILE keyword as it is with most current assembly syntaxes.
Here is the more complex construct. Because we are involved with a more complicated process to derive the variable that must be tested a new keyword is involved. The PREEVALUATE section contains the code to derive the conditional variable. It is executed before the expression is evaluated, always. The PREEVALUATE may be abbreviated as PRE.
keywait = objectfunction "KEYBOARD.KeyWaiting"
ECX = 0
WHILE EAX==0
PRE
call keywait
PRE END
ECX++ ;twiddle our thumbs
WHILE END
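The control flow of the PREEVALUATE form can be pictured with this Python sketch. It models only the ordering (pre-evaluate, then test, then body) and makes no claim about the code GA emits; while_pre is an invented helper and the list of key states stands in for the hypothetical results of keywait.

```python
def while_pre(preevaluate, condition, body):
    """Run preevaluate before every test of condition; loop while it holds."""
    while True:
        state = preevaluate()          # the PRE ... PRE END section
        if not condition(state):       # the WHILE expression
            return
        body()                         # the loop body


pending = iter([0, 0, 0, 65])  # made-up keywait results; 65 means a key arrived
counter = {"ecx": 0}


def twiddle():
    counter["ecx"] += 1  # ECX++


while_pre(lambda: next(pending), lambda eax: eax == 0, twiddle)
print(counter["ecx"])
```

The body runs once for every pre-evaluation that still reports no key, and the loop ends on the first one that does.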
DO/WHILE
DO
;code...
PREEVALUATE/PREEVALUATE END
WHILE expression
The DO/WHILE construct is essentially the same as WHILE/WHILE END except that the evaluation comes at the end.
LIST
LIST expression
cmp func
cmp func...
LIST END
Creates a structure which loads EAX, AX, or AL with a value. This value is compared to everything in the list. If there is a hit, then the (single) operation on that line is executed. Unless it is a JMP opcode, a jmp to the end of the list is also compiled in.
You can use the keyword break to force a branch/jump out of the list.
The expression can be any register or defined memory location. If you indirectly load the expression from memory you need to postfix a .b,.w or .d to the list, such as LIST.D [ESI]. For example:
%EQU ESC 27
%EQU RETURN 13
LIST [list_option]
ESC jmp escape_from_this_stupid_game
RETURN call do_ops
'0' call func_0
'1' call func_1
ALT_Q ret
LIST END
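In higher-level terms the LIST construct behaves much like a first-match dispatch table. The Python sketch below assumes entries are tested in the order written and that execution continues after LIST END when nothing matches; run_list and the string results are stand-ins for the real handlers.

```python
ESC, RETURN = 27, 13


def run_list(value, table):
    """Compare value against each entry in order and run the first match only."""
    for key, action in table:
        if value == key:
            return action()  # the implied jump to LIST END follows the action
    return None              # no hit: fall through past LIST END


table = [
    (ESC, lambda: "escape_from_this_stupid_game"),
    (RETURN, lambda: "do_ops"),
    (ord("0"), lambda: "func_0"),
    (ord("1"), lambda: "func_1"),
]

print(run_list(RETURN, table), run_list(ord("1"), table), run_list(99, table))
```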
REPT
REPT n
list...
REPT END
Repeats everything in the list n times. For example:
REPT 4
stos.l
REPT END
...compiles the following code:
stos.l
stos.l
stos.l
stos.l
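The expansion is purely textual, so it is easy to sketch in Python. This version assumes REPT blocks are not nested and that every REPT has a matching REPT END; expand_rept is an illustrative name only.

```python
def expand_rept(lines):
    """Expand non-nested 'REPT n' ... 'REPT END' blocks in a list of source lines."""
    out, i = [], 0
    while i < len(lines):
        words = lines[i].split()
        if len(words) == 2 and words[0].upper() == "REPT" and words[1].isdigit():
            end = i + 1
            while lines[end].strip().upper() != "REPT END":
                end += 1                      # find the matching REPT END
            out.extend(lines[i + 1:end] * int(words[1]))
            i = end + 1
        else:
            out.append(lines[i])
            i += 1
    return out


print(expand_rept(["REPT 4", "stos.l", "REPT END"]))
```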
FOR/NEXT
FOR init, until, increment
list...
FOR END (or NEXT)
Repeats everything in the list, stepping the loop variable from its initial value to the until value by the given increment. For example:
FOR y 1,200,1
FOR x 1,320,1
[baseaddress+x+y*320]=0
NEXT
NEXT
MACRO
MACRO name list...
body
MACRO END (or ENDM)
With a6.07 came about several new modifications that expanded the abilities of the MACRO body. You still, of course, define a MACRO with an optional list of parameters. At assembly time all the parameters are directly substituted. Here is one example, which might be familiar to many of you who have ever tried fire algorithms.
MACRO avg_point &src
clr eax
mov bl,[&src]
add eax,bl
mov bl,[&src-1]
add eax,bl
mov bl,[&src+1]
add eax,bl
mov bl,[&src-320]
add eax,bl
mov bl,[&src+320]
add eax,bl
MACRO END
You might use this as:
avg_point esi
VARIABLE SUBSTITUTION: THE AMPERSAND
You can use the variables passed to a macro within the macro body as substitutes. Within the macro, however, each substitute must be preceded by an ampersand ("&").
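Ampersand substitution amounts to a textual search and replace at assembly time. The Python sketch below shows the idea on a shortened avg_point body; expand_macro is a hypothetical helper, and leaving unknown &names untouched is an assumption made here, not documented GA behaviour.

```python
import re


def expand_macro(params, body, args):
    """Substitute each &param in the macro body with the matching argument."""
    if len(params) != len(args):
        raise ValueError("argument count does not match parameter list")
    binding = dict(zip(params, args))
    pattern = re.compile(r"&(\w+)")

    def substitute(match):
        # Unknown names are left as written (assumption for this sketch).
        return binding.get(match.group(1), match.group(0))

    return [pattern.sub(substitute, line) for line in body]


body = ["mov bl,[&src]", "add eax,bl", "mov bl,[&src-320]", "add eax,bl"]
expanded = expand_macro(["src"], body, ["esi"])
print("\n".join(expanded))
```

Calling the macro with esi therefore yields plain instructions that address memory through esi.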
ASSIGNMENT AND CONDITIONAL EVALUATION
You can assign a variable with a result from a MACRO such as
result = isprime 131
The question you may ask is how do we know what register or memory location is to be given to the variable result (which is a location in memory)? Each processor has a register that Grail assumes as its default conditional register. For the x86 this is EAX. You can expressly define the assignment register or override it by making an assignment to the ampersand itself, such as
&=EAX
ASSOCIATING CLASS STRUCTURES TO VARIABLES
Typically you use the %ASSC directive within MACROS almost exclusively. As a prime example, assume we want to make a simple C++-like new function. We might write that as the following macro.
MACRO new &class_struct
%ASSC &class_struct x
CORE __MMU.Allocate %sizeof &class_struct , 0, 0
& = EAX
MACRO END
We would use it like this.
xhandle = new x
Perhaps a better example might be this in the data section
CLASS STRUCT x
db bitwidth
d.32 ID
db.512 sector
STRUCT END
And this as the code
xhandle=new x
xhandle2=new x
x.bitwidth=32
xhandle.ID=7
;do other stuff
delete xhandle2
delete xhandle
PARSEMACRO
PARSEMACRO name
DATA
;body
DATA END
CODE
;body
CODE END
PARSEMACRO END
Introduced in a6.13, PARSEMACRO is a sibling of the MACRO, capable of emitting data and code sections based on the parsing of a command line. One straightforward example would be
print "Hello World."
Yes. That is proper in assembler. The parsed line would in the first assembly pass create something similar to
DATA SHARE READONLY
P0000000001:
ascii "METAPHOR.TextWindow_Print", NULL
DATA END
DATA ALIGN READONLY
P0000000002:
ascii "Hello World.", NULL
DATA END
and the following actual MACROish code where it was encountered
mov esi,P0000000001
CORE __OBJECT.Function
mov esi,P0000000002
call edi
The actual structure that is used to define the parsed macro print is
PARSEMACRO print
DATA SHARE READONLY
&1:
ascii "METAPHOR.TextWindow_Print", NULL
DATA END
DATA ALIGN READONLY
&2:
&enclose
&skip
ascii &close
ascii NULL
DATA END
CODE
mov esi,&1
CORE __OBJECT.Function
mov esi,&2
call edi
&=EAX
CODE END
PARSEMACRO END
The parse macro must be used outside of CODE or DATA sections.
THE AMPERSAND CONTROL CHARACTER
The ampersand is essentially a control character within a parse macro. The ampersand by itself defines what register is used by assignment or conditional evaluations. It can also be used as a label local only to the parse macro definition.
Thirdly, and most complex, it is used to parse the command line into a data section. After the parse macro keyword and a space, all the remaining characters on the line are emitted into the data section by selecting a character that you should scan to. With the print example above we scan to the first quotation mark, skip it, then emit everything until the next quotation mark, then add a NULL to the data.
If you do a scan without defining the data type that should be emitted, then that data is skipped (discarded).
The PARSEMACRO and MACRO pass arguments on as &1, &2, &3, etc. You can use these to specifically target an argument. &... indicates all arguments.
You are not limited to ascii data types. You can in fact emit the data into any data type. The string will automatically be converted to a binary numeric if necessary.
&skip
Skip one character.
&&
The very next character following the && pair is scanned for.
&colon
Scans for a colon ":".
&separator
Scans for a separator such as a space or a comma.
&space
Scan for a space.
&end
Scan to the end.
&enclose
Scan for the enclosing quotation.
&close
Scan for the closing quotation.
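As an illustration of how these scan directives combine, here is a Python sketch of the print example: scan to the opening quotation mark, skip it, emit everything up to the closing quotation mark as ascii, then append a NULL. The function name and its handling of the cursor are assumptions made for this sketch and say nothing about the real parser.

```python
def parse_print_argument(rest_of_line):
    """Return the NULL-terminated bytes that a print "..." line would emit."""
    cursor = rest_of_line.index('"')           # &enclose: scan to the opening quote (discarded)
    cursor += 1                                # &skip: step over the quote itself
    closing = rest_of_line.index('"', cursor)  # &close: scan to the closing quote
    text = rest_of_line[cursor:closing]        # emitted as ascii
    return text.encode("ascii") + b"\x00"      # ascii NULL


emitted = parse_print_argument(' "Hello World."')
print(emitted)
```

A line without both quotation marks raises ValueError in this sketch; how GA reports such a line is not covered here.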
ATOMIC
ATOMIC
;code
ATOMIC END
While the usefulness of this construct is something to be debated, you can nevertheless enclose a very small number of instructions within an atomic construct. All atomics are to be assumed to execute without interruption from other sources. For some implementations of the core on some processors this does absolutely nothing, other than provide explicit commentary.
MULTI-PLATFORM HIGHER-LEVEL EXPRESSIONS
GA supports processor-independent constructions such as assignments and indirection. GA is capable of some rather useful assignment constructions, which are both evaluated and produce code at assembly time, including:
Addition "+"
Subtraction "-"
Multiplication "*"
Division "/"
Logical Shifts left and right "<<", ">>"
Assignment "="
Indirection "->"
Negation, "!"
Boolean Truth Comparisons: Equals "==", Not Equals "!="
Grouping, "(" and ")"
For example, on the x86 processors, you can use the simple assignment:
al = 0
which is assembled into
mov al, 0
(or sub al,al if optimal assembly is engaged).
The indirection
esi->GUTC.Millisecond = 0
is assembled into
mov [esi+8],0
You can even use more complicated expressions such as (assuming p is esi):
%EQU y 320
al = ([p-1] + [p+1] + [p-y]+[p+y])>>2
which might produce the following (%unoptimal) code
push ebx
push eax
clr eax
clr ebx
mov al,[esi-1]
add ebx,eax
mov al,[esi+1]
add ebx,eax
mov al,[esi-320]
add ebx,eax
mov al,[esi+320]
add ebx,eax
pull eax
shr ebx,2
mov al,bl
pull ebx
The indirection pointer will only work with class structures.
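What the averaging expression computes can be checked with a short Python model. It assumes the four neighbours are unsigned bytes in a flat buffer with 320 bytes per row, and that only the low 8 bits of the result are kept (as the final move into al suggests); average_neighbours and the sample values are invented for the example.

```python
def average_neighbours(buffer, p, y=320):
    """Model of al = ([p-1] + [p+1] + [p-y] + [p+y]) >> 2 on a byte buffer."""
    total = buffer[p - 1] + buffer[p + 1] + buffer[p - y] + buffer[p + y]
    return (total >> 2) & 0xFF  # keep the low byte, as al would


buffer = bytearray(3 * 320)   # three rows of 320 bytes
p = 320 + 10                  # a point in the middle row
buffer[p - 1], buffer[p + 1] = 10, 20
buffer[p - 320], buffer[p + 320] = 30, 43
print(average_neighbours(buffer, p))
```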
OFFSETS AND THE BRACKETS
To allow more natural looking equations to be formed, it is assumed that any high-level assignment (i.e., one that involves an equals sign) will never work with offsets, only with the values stored at a variable's address. Thus, you can write
[x] = [susy]+[sissy]+[yazzie]
or
x=susy+sissy+yazzie
This is only valid with higher level constructs. Brackets must be used with straight assembly op-codes.
ERRORS / WARNINGS
If on the off chance your code generates errors :) they are all shunted to wherever you specify in the header unless your policy settings override this. You generally have the choice of ignoring them, creating a text file with the output and/or having them meshed with your debugging information.
DEBUG INFORMATION
The debug information for Grail is in a token/length format. ALL of the source code is included in the debug info. If there is debugging info then space is allocated for it and it is loaded into memory. Grail does nothing else with it, besides let it take up space in memory. You must have a debugger that will make use of this information. If you use the proper object policy you can force Grail to strip this out of the object module as it loads it so as to save memory.
Debug information consists of a line number (d.32), a pointer to the beginning of the code that was compiled for that line, if any, and a NULL-terminated string containing the text source for that line. There then follows another string of text that is computed/generated at assembly time.
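A single debug entry of the kind just described could be packed and unpacked as in the Python sketch below. Several details are assumptions made purely for illustration: a 32-bit little-endian code pointer, ASCII text, a NULL terminator on the second string as well, and no token/length framing around the entry. The function names are hypothetical.

```python
import struct


def pack_debug_record(line_number, code_pointer, source_text, generated_text):
    """Pack one entry: d.32 line number, 32-bit pointer, two NULL-terminated strings."""
    header = struct.pack("<II", line_number, code_pointer)  # little-endian is assumed
    return (header
            + source_text.encode("ascii") + b"\x00"
            + generated_text.encode("ascii") + b"\x00")


def unpack_debug_record(record):
    """Reverse pack_debug_record for a single, complete entry."""
    line_number, code_pointer = struct.unpack_from("<II", record, 0)
    source, generated, _ = record[8:].split(b"\x00", 2)
    return line_number, code_pointer, source.decode("ascii"), generated.decode("ascii")


record = pack_debug_record(42, 0x1000, "al = 0", "mov al, 0")
print(unpack_debug_record(record))
```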
WRITING (USER) DEVICE OBJECTS
When writing any object that deals with hardware devices you must explicitly declare that it is a device object. This has two effects. One is that the assembler does not complain about I/O instructions. The other is that the object is not allowed to have multiple instances.... normally. To use a device you must go through its device object, which essentially is that device's custom manager.
(That is, customized to that specific device model, or revision).
Please note that Filesystem Device Objects are tightly coupled with the GRAIL CORE (in fact, they are considered part of it). They are not the normally encountered user device objects. They require the professional assembler to create them. Read the section on Filesystem Device Objects.
An incomplete shell of an object that handles I/O with the IBM PC CMOS is available as the file cmos.asm in the current assembler source pack (usually GASx.ZIP, such as GAEa615.ZIP).
WRITING FILESYSTEM DEVICE OBJECTS
Filesystem device objects are different than the normal user device objects. They are considered part of the core set and run at CPL0. When loaded in they couple and decouple with the core in a very intimate way. Only high security objects can load/unload FS device objects.
A filesystem object is required to have the following functions, which are linked into the core internal structures:
- DEVICE_Aquisition
- DEVICE_Device_Information
- DEVICE_Status
- DEVICE_Setup_Environment
- DEVICE_Setup_Device
- DEVICE_Operation
Refer to the CORE documents for more information.
ASSEMBLER
This takes source code which has hyper-mnemonic machine-language instructions and converts them to machine-language, data and system control information. As with everything, the assembler is an object in and of itself as well. It is the ASSEMBLER object and resides in root space and the assembler folder. If you need a file or a block of memory compiled you load in the assembler object and do so.
There are no switches or the like. You merely supply a filename or a block of memory that consists of ASCIIZ13 data which is NULL terminated and it is converted into an object module. Take this script for example:
ASSEMBLER_Compile_FiletoObjectFile "Test_Assembly"
Easy enough. If your assembly policy is set to override for debugging info then it will be so. You should also set your policy to determine where the compiled object will be placed. The default is to put it in the same folder as the source.
ASSEMBLER.Compile_FiletoObjectFile ptr_refID
Takes a refID of a file and compiles that file into an object file.
ASSEMBLER.Compile_FiletoObjectMemory ptr_filename
Compiles the file and automatically loads it into memory leaving no trace in the filespace.
Copyright © 1995,1996, Lewis Sellers. All Rights Reserved.
TRADEMARKS
The Pentium and Pentium Pro are trademarks of the Intel Corporation. TASM is a trademark of Borland International.