Porting 8086-toolchain to ELKS #2112
Comments
I see my name exists. Let me know if you want any help. I'm happy to take patches for whatever I've moved to Codeberg. I'm not dead yet! |
One obvious thing lacking is an "ar" implementation. For static libraries. It might be worth to look into Dev86 implementation, too. |
I have temporarily been using the upstream 8086-toolchain to quickly compile macOS-hosted versions of C86 and NASM, since @rafael2k's version is currently an ELKS-only build. I have been playing with its C86 and NASM to get more information on how they work and their continued suitability for an ELKS-hosted 8086-only C compiler toolchain. Build script:
Here are the current results:
Looking at the class slides showing the intended workflow for the toolchain, it appears C86 is not built for, and knows nothing about, multiple input or output files. The class workflow looks somewhat like the following:
So we could face an uphill battle for certain unhandled situations (e.g. extern functions) and have to modify C86 in order to get NASM to output .o or .obj output that will communicate properly with the linker what to do with various constructs. Examples of problems could be .comm data (e.g. This is not all bad - C86 itself seems to handle the C source I've thrown at it - but creating a "toolchain" out of it might take some work, unless we want to generally produce smaller programs and/or compile and assemble everything at once. I'll help with whatever needs to be done. |
Yes, Dev86 has ar, and it would be well suited if Dev86 LD is used. Given my report above, we're getting ahead of ourselves, since C86 doesn't ever produce a GLOBAL or EXTERN directive in its .asm output, so there'd be no symbols to manage, and NASM won't produce a .o or .obj file with undefined externals. So, for now, we're really talking about just getting CPP, C86 and NASM to compile, assemble, and produce a .bin (.com) binary output file with just those three tools. In order to do anything actually useful, we'll use @rafael2k's poor man's a.out header (likely created with an included .asm file) and the nasm -P option to automatically include it. Something like:
After that, in order to do anything useful (like call an ELKS system call to display something), we can easily produce a syscalls.asm file that can be added to the NASM assembly step, assembling header.asm, file.asm and syscalls.asm into a single a.out-compatible file that will load and run on ELKS. The NCC Project, oriented towards x86-64, also uses NASM and takes a similar approach for syscalls, where a Linux-compatible list of system calls is linked to provide all system calls. While each system call should ideally be in a separate library function, the NCC approach will work very well to provide a full set of system calls in a single .asm file, just by renumbering the system calls to match ELKS' system call list. After that, a CC wrapper program could be built which automatically performs much of this workflow and hides it from the user. Even though programs might initially be quite a bit larger than necessary (because of the inclusion of all system calls or even perhaps a full mini-libc in a single source file), it could all be made to work. I like C86, but the big disadvantage is that we're not starting with a toolchain; instead we're having to build one. Lots of work, but fun. |
More news: on the upstream 8086-toolchain class online resources page, there's a link to known tool problems which discusses some known problems with C86. While most are seemingly OK, like having to declare all local variables at the start of a function rather than anywhere, there are a couple issues that could be very problematic for porting any larger piece(s) of code: the compiler apparently has problems dealing with multiple C statements on a single line separated by semicolon, as well as having register allocation problems when more than one C operator is used in an expression at once (which means porting the ELKS library code will be problematic), and then a killer problem of the usage of long/unsigned long (any 32-bit arithmetic) not working well. I've asked the 8086-toolchain maintainer for a copy of the C86 Manual, which is currently a dead link. Hopefully it can be found and we can read more about what C86 does and doesn't do. |
Oh... Reading this and your following answers, it seems c86 has some problems that make it unsuitable for anything more than small, quick projects, at least without more thorough changes to the compiler. Still, it might be useful in some cases. Btw, ar might potentially be used directly with nasm and ld86. That of course requires the project to be written entirely in assembly. |
Hi all. I got the cpp and ld from https://github.com/lkundrak/dev86 |
Yes, I meant to mention this when I had the chance, but you beat me to it! c86 has some limitations that may not make it suitable as a general-purpose 8086 C compiler. It was only used to compile a toy Real-time Operating System for the 8086 emulator. But y'all seem like you know what you're talking about with compilers, so perhaps you might be able to fix those limitations. I myself have not done much compiler work, so I probably won't be of much help here. |
As far as the license and history goes for the project, here is what my professor, James Archibald, said to me via email (on 2024-11-20):
So I believe you are free to extend and use as you see fit. |
C86 has some info on licensing in cmain.c (main.c in earlier versions)
|
An oral authorization is enough in my opinion, as this implies no one will try to sue us in the future. |
Unrelated to the question of licensing (not a lawyer, anyway), there is some documentation about c86 here: http://retro.co.za/68000/CC68K/QDOSC68K/c68.txt Although, this is related to a version maintained by the Walkers (Keith and Dave). I wonder if ECEn 425 version is based on this or on an earlier version? Also, I wonder if there is a newer release by Keith and Dave? Other parts of toolchain on that page are for QDOS (Sinclair QL). So, not really interesting for us here. |
There are two versions of C86 sources here (one from 1998, one from 1999): |
Found more docs (for QL version): |
I found a copy of the compiler's homepage: Also available on Internet Archive: |
@ghaerr sent this message:
|
From http://retro.co.za/68000/CC68K/QDOSC68K/c68.txt See ghaerr/elks#2112 (comment) I believe this is the original c86 text manual referenced in the class, though I don't know if there were any modifications. But the size (~146.4 KB) is close to the 147 KB expected size, so I suspect any modifications made were small.
That's a different project, as far as I know. It has been in development since the early nineties, maybe even before. As far as I know, it was written from scratch and it has always supported 386 only. It never had support for 8086. |
Oh. I see I am wrong. I was sure it was written from scratch. |
One thing I was right about, it only emits 386 code:
And, it has been in development for so long, it's probably significantly changed compared to the original. Might have as well been rewritten a couple of times. :) |
Here is the manual of the compiler version I'm using at 8086-toolchain: The issue in 8086-toolchain upstream repo: hintron/8086-toolchain#13 |
I found Mathew Brandt's version: Original copyright notice:
|
I might be wrong about this too. I can vaguely place it in the late 90s, although the earliest version I found is this one from 2000 (ccdl*.zip): |
We need to compare to understand how different it is from the one in the 8086-toolchain. No problem in changing the source, if it makes sense. |
Found an early version from 1996 (ccdl122.zip). This version also supports only 68k and 386. |
But, what I can tell you is, it likely doesn't fit the public domain definition. It's also not OSI compliant. That doesn't mean it's not hackable, changeable or usable. It just means it's not really open source in the OSI sense of the word and has limited usage and distribution permissions in comparison to GPL, BSD and MIT licensed stuff. |
Btw, @ghaerr, can you advise on adapting ld86 from elks aout v0 to v1? |
The a.out .version field was changed from 0 to 1 to indicate that the previous 32-bit .chmem field's upper 16 bits were split off into a new 16-bit .minstack field occupying the same space. This allows the developer to specify a separate heap value in .chmem and stack size in .minstack. Nothing else was changed. ELKS can load both V0 and V1 executables, so no immediate change is necessary. At some point ld86 could be enhanced to output V1 executables by adding a min stack size command line argument and writing it in the revised header, along with version = 1. For now, since the executables likely to be built using either your poor man's header or ld86 are small, specifying either v0 or v1 will work the same, as both the chmem and minstack fields are zero anyway. I can help more with this when ld86 is actually needed by c86, as it seems for now there won't be any need for ld86 or ar until c86 is enhanced to output GLOBAL and EXTERN directives for function and data symbols. What this really means is that, for now, a poor man's a.out header can be fairly easily implemented with no c86 modifications using a pre- and post- .asm file around the NASM-assembled c86 output, with NASM then creating the a.out file directly using its -f bin option. The pre- header will list some internal symbols for start of text and data along with the a.out structure itself at address 0, and the post- header will calculate the length of each for inclusion in the a.out header fields. |
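To make the V0/V1 difference concrete, here is a minimal sketch in Python of just the four bytes in question. It models only the field split described above, not the real header layout: the function names are made up for illustration, and it assumes the fields are stored little-endian (as on the 8086) with the low 16 bits remaining .chmem and the upper 16 bits becoming .minstack.

```python
import struct

def pack_v0_chmem(chmem32):
    """V0 view: a single 32-bit little-endian chmem field."""
    return struct.pack("<I", chmem32)

def pack_v1_chmem_minstack(chmem, minstack):
    """V1 view: the same four bytes as two 16-bit little-endian fields.
    Assumption: chmem keeps the low half, minstack takes the upper half."""
    return struct.pack("<HH", chmem, minstack)

def read_as_v1(raw):
    """Reinterpret four header bytes as (chmem, minstack)."""
    return struct.unpack("<HH", raw)

if __name__ == "__main__":
    # With both fields zero, the V0 and V1 encodings are byte-identical,
    # which is why small programs can claim either version.
    print(pack_v0_chmem(0) == pack_v1_chmem_minstack(0, 0))
```

The point of the sketch is only that a header with zero in these fields is the same bytes under either interpretation, so nothing needs to change until someone actually wants a non-default stack size.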
After installing OWC (was easier than I have expected), now for
I am using the binary version of OWC (open-watcom-2_0-c-linux-x64) and following:
I am using the 64 bit version of OWC. |
Doing
|
Well, I tried building ELKS, but had some problems. Luckily, the toolchain built ok, so I played a bit with it. I can see what Greg was talking about with curses. ELKS currently doesn't have a curses implementation outside of Greg's workaround in elkscmd/tui. I thought ELKS had a small curses implementation from before. Guess I was wrong. |
But, Greg's curses is a nice and small implementation. I might hack on it to implement stuff that the game I'm currently playing with porting requires. |
Hmm, the ow-libc GitHub workflow/action is able to build libc using OWC, which means that something must be misconfigured on my end, and that is why I cannot compile libc with OWC. |
I updated my libc/wcenv.sh as described here: https://github.com/ghaerr/elks/wiki/Using-OpenWatcom-C-with-ELKS Still nothing (that was not the problem I think):
|
Wow, this is awesome! Habemus libc!! |
How about good ol' jove? I plan to port JOVE (Jonathan's Own Version of Emacs) as soon as I come back. Do you all think it is a good Emacs clone to port? It is known to run in large model already, so it is a safe bet - it will work on ELKS for sure. I created an issue on jove's issue page: jonmacs/jove#22 |
Yes, OpenWatcom is required in order to build the 8086 toolchain. Some of the programs use ia16-elf-gcc, and the others use OWC.
The problem is that stdarg.h is not being found. This is an OWC header file, which is not present in the toolchain or ELKS. The line that controls this is in wcenv.sh (in my case):
Change directory to that directory, and you should be able to see an 'h' directory there, which are all the header files.
So it appears that you might not have the headers installed with OWC, or they are somewhere else?
Try running '. libc/wcenv.sh' instead. ELKS and toolchain only need the WATCOM= variable set, and the $WATCOM/binl64 directory is likely where the binaries are, and the $WATCOM/h directory is where the headers should be. |
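A quick way to check an install against the layout described above is something like the following Python sketch. The function name and messages are hypothetical; the directory names (binl64, h, lh) and stdarg.h are the ones mentioned in this thread, and binl64 in particular is only where the binaries are likely to be on a 64-bit Linux host.

```python
import os

def check_watcom(watcom_dir):
    """Return a list of likely problems with a WATCOM= directory (empty if none found)."""
    problems = []
    # Host binaries: assumed to live in binl64 on a 64-bit Linux install.
    if not os.path.isdir(os.path.join(watcom_dir, "binl64")):
        problems.append("binl64 directory not found")
    # The build error in this thread was stdarg.h not being found under h.
    if not os.path.isfile(os.path.join(watcom_dir, "h", "stdarg.h")):
        problems.append("h/stdarg.h not found")
    return problems

if __name__ == "__main__":
    watcom = os.environ.get("WATCOM", "")
    for line in check_watcom(watcom) or ["looks OK"]:
        print(line)
```

Run it with WATCOM set the same way wcenv.sh sets it; an empty result only means the two paths exist, not that the headers are the right ones.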
Please, go ahead. I have found the super small implementation works well for almost all the curses apps that don't require screen windows (thus requiring saving screen contents, etc) but just need cursor control, colors, etc. I have been thinking about moving the small curses into the C library so that the programs don't have to live in elkscmd/tui. |
Sure. Go ahead. But I'll still want to try porting some of the other ones. :) It's no problem if there are more of them. Only one will probably go to one of the main disks, other ports will have to go on contrib disks. :) And stop hacking on your vacation. You are supposed to have fun (other kinds of fun!). :) |
I see that my header folder in watcom is called "lh" and not just "h". It seems libc compiles without error, but I do not get a file called /root/elks/libc/libc86.a in the end. This is a problem for the toolchain. |
This is my libc after compilation:
And the toolchain searches for a libc86.a in this folder. |
lh is probably linux headers, which are not what we want - they're oriented around 32-bit. I don't know what the WATCOM= directory looks like that you've installed, but you can't use lh. I would have to see an ls -l of the WATCOM= dir to help you on what is included or not in your installation. Usually, WATCOM= ends with '/rel': WATCOM=/Users/greg/net/open-watcom-v2/rel on my system.
We're getting confused here: you first have to build the OWC libc for other ELKS programs to link with, THEN you can build the C86 toolchain. The OWC library is built - do an 'ls -lt' and you will see libc.lib. This is the library that the 8086 toolchain links with. AFTER the C86 toolchain is built, then you get to rebuild the ELKS C library for C86 with C86. That will build the library libc86.a. If you are having problems building the OWC library, try the following: It should give an error if it can't build it. I put the following commands in a shell script called 'x':
Then run './x > file'. This will display only the errors, with the normal messages going to 'file'. The C86 library is built the same way as above but you can use c86.* instead of watcom.*. Complicated? Yes! To recap: first you build C library with ELKS, then you build the C library with OWC, then you build the C library with C86. Each has a different name for the library, and the tools built by each compiler can only be built when that compiler's C library has been built. The C library is the same in all three cases, but built with different options. |
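The ordering in that recap can be written down as a small dependency table. The Python sketch below is only one reading of the thread: the step names are made up, and the prerequisite lists assume the C86 toolchain needs both the ELKS (ia16-elf-gcc) and OWC libraries, that libc86.a needs the finished C86 toolchain, and that examples need libc86.a.

```python
# Hypothetical step names; prerequisites follow the recap above.
PREREQS = {
    "elks_libc": [],                              # C library built with the ELKS compiler
    "owc_libc": [],                               # libc.lib, built with OWC
    "c86_toolchain": ["elks_libc", "owc_libc"],   # tools link against both libraries
    "c86_libc": ["c86_toolchain"],                # libc86.a, built with C86 itself
    "examples": ["c86_libc"],                     # examples link against libc86.a
}

def first_violation(order):
    """Return (step, missing_prerequisite) for the first out-of-order step, else None."""
    done = set()
    for step in order:
        for need in PREREQS[step]:
            if need not in done:
                return (step, need)
        done.add(step)
    return None

if __name__ == "__main__":
    # Building examples right after the toolchain fails on a clean system.
    print(first_violation(["elks_libc", "owc_libc", "c86_toolchain", "examples", "c86_libc"]))
```

This is the same failure seen when the examples/ directory is compiled straight after the toolchain on a system where libc86.a has never been built.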
I have this in /usr/bin/watcom:
There was no h, I added it. Make host in the toolchain gives:
|
|
When installing watcom there are some options. Maybe I did not select something? |
Yes. You need to do a full install! |
Looks like you installed the "linux" version of OWC, which is why the lh headers are there. Delete the h dir you created; it is not correct and is now confusing the issue. Things may seem to work but they won't work properly, and we'll be chasing more problems. I can correct this by copying the required OWC .h files into our repo. I didn't want to do this, but now see it's probably better, at least for the files in our C lib that require them, like stdarg.h.
You must now also build the C86 library, don't worry about the examples/ directory issue. CD back to $ELKS/libc and do:
Then you should have libc/libc86.a. I'll look into fixing this also so that the examples/ doesn't fail first time.
That's not great, since it appears that OpenWatcom is exporting its own headers for its compiler. I think our build overrides that though. So this is OK, except your OWC toolchain is probably building the ELKS OWC library with the wrong headers, which ultimately means that any ELKS program built with OWC, which includes most of the 8086 toolchain, could be incorrect. As I said above, I'll look into fixing this so you don't have to worry about the fact that you only installed the Linux OWC, not the DOS OWC. We don't want to use either set of headers, now that I think of it - so I'll copy them all and this issue will be moot. |
Probably. But I'm going to add the proper headers anyways, as otherwise our repo build could be using incorrect headers much like you've done and nobody would know, which would then generate bug reports for installation issues, not bugs in programs. I'll let you know when I have done that, and then you should be able to just do a git pull and rebuild. I will try to add an upper level make entry point for the OWC build, and then the C86 build, as well. |
I did a full install. |
No. /root/8086-toolchain/libc/libc86.a is not what is needed to compile the examples. |
No - examples fails because the script just finished building C86 toolchain, and then tries to compile examples when the C86 library hasn't been built - which is over in the ELKS repo. I will undo the change of building examples with building C86, it only works on my system because a previous build built the C86 library. Do you see?
Correct. The 8086-toolchain/libc is now not used. I only left it in so that "examples" which didn't use the ELKS C library could still be used with the toolchain. But there's not much point in that now, and it's confusing. I will remove that also. |
> I didn't want to do this, but now see it's probably better, at least for the files in our C lib that require them, like stdarg.h. Well, I didn't want to do this either, since OpenWatcom can be upgraded in its own repo. It was easy if the proper 16-bit compilers are checked in a custom install or full install... I heard there is a user who modified OWC itself for PC-98; I'll let him know the h location changed if needed. |
This issue has gotten very long, so I'm closing it and replacing it with #2159. |
@tyama501, I have copied over the required stdarg.h and a couple other standard headers into the ELKS repo. That solves the problem of whether or not someone installed OWC for linux vs DOS, etc. The header files that I copied will work regardless of the OWC version installed. In other words, we no longer care about which version of the OWC headers are installed, because we don't use any of them from the OWC repo. I have reversed my opinion because any headers that are used must be compatible with our version of ELKS libc, so it is OK if a header is missing - an error will be thrown and we can see it and fix it. Using OWC headers without looking at them means previously an OWC header could be used that was only compatible with their library - and we're not using the OWC library, we are always using the ELKS C library. Ultimately this is because OWC does not actually provide support for ELKS. |
This is a continuation of the discussion in #1443 (comment), regarding issues getting what is hopefully the latest version of a C86 compiler and @rafael2k's port of its included (older) NASM assembler running on ELKS.
At the moment, there is some consideration of using Dev86's CPP C preprocessor, producing Dev86-compatible AS86-format object file output from NASM, and possibly using Dev86's LD linker, as both CPP and LD are (hopefully) likely to be easily ported to the ELKS 8086-only environment.
I'm not sure where the best current sources are for Dev86 - it used to be that @jbruchon hosted them on GitHub, and that version's upstream is quite old, but still present: https://github.com/lkundrak/dev86. It seems that jbruchon has moved his version to Codeberg at https://codeberg.org/jbruchon/dev86. During the last four years, I am aware of a number of bug fixes posted to his repo when it used to be on GitHub. I would recommend starting with jbruchon's Dev86 unless another more updated version is found on GitHub.
ELKS shares quite a history with Dev86: just five years ago the entire kernel and C library were compiled using its BCC->AS86->LD toolchain. The ELKS C library had originally been in dev86/libc but had been moved prior to that.
While it could make sense to use Dev86's CPP and LD in order to get C86 running more quickly on ELKS, unfortunately the BCC compiler is K&R only, and doesn't support ANSI C at all.
@rafael2k, which repo are you using for your CPP and future LD ports? I would assume that if you can get them running, both will be moved into your https://github.com/rafael2k/8086-toolchain repo.