Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

patch for m68k floats #156

Closed
wants to merge 10 commits into from
Closed

Conversation

czurnieden
Copy link
Contributor

Saw that mp_get_double has troubles with the m68k architecture . It is a blind patch, I was not able to persuade qemu to do its deeds and my LC II has no FPU (Yes, I am aware of https://www.shiga-med.ac.jp/people/sugimoto%27s/mac/lc2.html ).

@minad
Copy link
Member

minad commented Feb 8, 2019

@czurnieden Is there a possibility to fix the existing set_double function instead of providing an additional function for mk68 (by accessing the bit pattern). Alternatively I suggest to simply disable set_double support on mk68 as we also do on other platforms which don't provide IEEE754. We can provide a fix as soon as someone comes by who has a working mk68 qemu instance.

@dod38fr
Copy link

dod38fr commented Feb 8, 2019

Debian wiki mentions: "As upstream QEMU currently does not provide full m68k CPU support, you need to use a patched version of QEMU which has full m68k CPU support developed by Laurent Vivier. Laurent's version can be found on github".

This page also provides instructions to build qemu-m68k on Debian.

@czurnieden
Copy link
Contributor Author

(@dod38fr Laurent Vivier's version does not compile for me, does not find installed libraries, maybe something in the build process, had no time to check it yet.) Was able to get sid's qemu to run which made the error replicable. Sort of, because the error message in demo/demo.c could be a bit more verbose.

Without this patch but the replacement

printf("\nmp_get_double(+dbl) at i = %d bad result! %20.20f != %20.20f  - %llx != %llx \n",i, dbl, mp_get_double(&a),(uint64_t)dbl, (uint64_t)mp_get_double(&a) );

I got

mp_get_double(+dbl) at i = 0 bad result! 1656331715888600576.00000000000000000000 != 1656331715888600576.00000000000000000000  - 16fc78eb83c0d600 != 16fc78eb83c0d600

which seems to point to a bug in qemu's FPU-emulation. It would be nice if somebody is able to verify that with the patched qemu version mentioned above.

Or switch it off completely for m68k as @minad suggested.

@czurnieden
Copy link
Contributor Author

Was finally able to build Laurent Vivier's version (had to add /PATH/TO/libibverbs.a /PATH/TO/librdmacm.a to the variable COMMON_LDADDSinqemu-m68k/m68k-linux-user/Makefile`).
With the very same result.

@minad
Copy link
Member

minad commented Feb 9, 2019

@czurnieden What is the result in qemu if you use the existing mp_get_double function? Is there an endianness problem?

@czurnieden
Copy link
Contributor Author

@minad I honestly do not know what the problem is. It is not an endian problem in set_double, the original version shows the same behaviour as the endian-ignorant one. The problem lies most likely in QEMU. One of the test-loops shows errors if input is not a power of 2 but slightly below (see dempo/demo.c, it is annotated) but only for i in {53..63}. The original loop works now. It had an odd behaviour: The first run (i == 0) had a very large number with a failure, works ok if skipped, that is, if the first rand() calls were not used. drand48() showed the same behaviour.

I finally gave up and switched them off for now. Builds and tests ok without. New problem:

@dod38fr Lintian returns an error:

E: libtommath1: shlib-with-non-pic-code usr/lib/m68k-linux-gnu/libtommath.so.1.1.0
E: Lintian run failed (policy violation)

@minad
Copy link
Member

minad commented Feb 9, 2019

I think turning it off is a good idea in that case.

@czurnieden
Copy link
Contributor Author

Found out that the m86k fpus have the same "feature" as e.g.: the x86: hidden bits of extra precision. See e.g.: "The pitfalls of verifying floating-point computations" by David Monniaux arxiv.org and the GCC manpage (entry for -ffloat-store).

Added a small entry in makefile_include.mk to take care of that.

Builds and tests ok now, but the Lintian error is still there.

@minad
Copy link
Member

minad commented Feb 10, 2019

There is also fexcess-precision=standard. Does that work too? c99 enables this flag by default.

@czurnieden
Copy link
Contributor Author

@minad that flag influences casts, too, and I wanted to change as little as possible. Oh, and Debian needs C90 compliance as far as i understand it. But otherwise: I see no problem here but it should be documented somewhere.

@dod38fr the Lintian error is caused by

$ eu-findtextrel .libs/libtommath.so.1.1.0
./bn_mp_exptmod.c not compiled with -fpic/-fPIC

Couldn't find anything quickly, no idea, sorry.

@sjaeckel sjaeckel requested a review from minad February 10, 2019 22:29
Copy link
Member

@minad minad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only changes to the tests and compile flags. Looks good.

@dod38fr
Copy link

dod38fr commented Feb 11, 2019

libtommath with this PR builds fine on my system. Please make sure that you have an up-to-date lintian on Debian/unstable.

Anyway, I've uploaded libtommath patched like this PR as version 1.1.0-3~exp2. The build still fails with :

Testing: mp_get_double
mp_get_double(+dbl_count) at i = 53 bad result! -9007199254740992.00000000000000000000 != -9007199254740992.00000000000000000000

HTH

@minad
Copy link
Member

minad commented Feb 11, 2019

@czurnieden It seems the ffloat-store option is not applied in the build log. Is the ARCH condition in the makefile correct for the cross compiler used by the Debian build server?

@czurnieden
Copy link
Contributor Author

@dod38fr My buildd system is, despite all efforts, significantly different from that on m68k-do-02.buildd.org?
Great. *sigh*
Lintian is the newest says sbuild and I'm afraid it isn't Lintian, it is an actual problem with libtommath. But one thing after the other.

*strokes his sequipedestrial beard*

OK, so let's get rid of all that fancy-schmancy newfangled compiler roptions and do it the old and tested way: measure the relative error and check against DBL_EPSILON.

For those of you who want to try it themselves (or were sent here by Google):

Note: my host system is based on Mate 17.1 "Rebecca", a quite old Debian stable system.

We are on the host.

Get a debian-testing-$ARCH-netinst.iso from somewhere trustable, e.g.: https://cdimage.debian.org/cdimage/weekly-builds/

Build a virtual harddrive with 20G space (way too big, even 10G is more than enough but it clutters quite quickly if you are me ;-) )

qemu-img create debian64formk68k.img 20G

Start QEMU and Install (you can take all the defaults with the exception of the keyboard layout if you have a different layout than the english one and you need to install the sshd for later communication). -display sdl is used because QEMU wasn't compiled with GTK support here, no other reason.

qemu-system-x86_64 -smp 2 -enable-kvm -k de -rtc base=localtime -soundhw ac97 -vga std -display sdl -hda debian64formk68k.img -cdrom debian-testing-amd64-netinst.iso -boot d -m 2048

I installed a desktop (Mate, the default Gnome is way too slow with QEMU), too, to make things more comfortable for me but your mileage may vary.

You cannot simple reboot here, you need to start QEMU again, slightly different, to boot from the virtual disk.

qemu-system-x86_64 -smp 2 -enable-kvm -k de -rtc base=localtime -soundhw ac97 -vga std -display sdl -hda debian64formk68k.img -boot c -m 2048

We are now in the guest system.

Set display size if needed and make yourself comfortable in the new system by installing all your regularly needed tools like your editor of choice.

Once done you can follow the advice at https://wiki.debian.org/M68k/sbuildQEMU which worked for me with the exception of the problem with the missing libraries describe somewhere above in this thread.

Get the newest patch from https://packages.debian.org/de/source/experimental/libtommath (at the end, several files, al are needed) which is currently libtommath-1.1.0-3~exp2 and unpack it. Please check http://ftp.debian.org/debian/doc/source-unpack.txt if you are unsure of how to do it manually.

cd into libtommath-1.1.0 or whatever the version is called at your time of reading.

I start sbuild as follows:

sbuild --purge-deps=never --source --arch-all -d sid --arch=m68k

It either works or not. If not, the most likely outcome if you have the need to do it locally, edit the code and make a patch with:

dpkg-source --commit

It is interactive, just fill out the forms. You don't need to edit anything but an individual patch-name is needed.

If you mistyped the patchname or have other reasons to get rid of it you need to delete that patch in ./debian/patches/ and in the list at ./debian/patches/series

Start sbuild again:

sbuild --purge-deps=never --source --arch-all -d sid --arch=m68k

Rinse and repeat until you get a clean build.

That invocation of sbuild does build in an individual directory every time. It is quite time consuming because it will build the whole stuff again and again, but I found no way to persuade sbuild to use the same build directory every time (if you know how you can add --purge-build=never to the sbuild command to keep it. It would also be nice if you tell me how you did it ;-) ).

There are many ways for a communication between the host and the guest. The simplest way if you are in a local network like most of us who are behind a router is via scp.

Run sshd on both guest and host.

Find the IP address of the host, e.g. with ip a. Let's say it is 192.168.0.7

Now you can use scp on the guest(!) like e.g.:

To copy a file from guest to host:
scp file_to_copy [email protected]:/home/host_user/file_to_copy

To get a file from the host to the guest
scp [email protected]:/home/host_user/file_to_copy file_to_copy

It also works if you do not have a local network but I find it it is easier to secure thet way. YMMV, of course.

@czurnieden
Copy link
Contributor Author

Works in my local buildd environment. Let's see if it works elsewhere, too.

Oh, and it was indeed a problem with the extra precision. The recursions x_{n+1} = 2x_n -1 with x_n = 2.0 and n < 301 test it quite effectively-the error comes at n = 53 and is gone at n = 64, the mantissa difference between a double and a long double and the difference between in- and output at n=53 is 1.0000000000000... despite printing as the same number with printf.

@minad
Copy link
Member

minad commented Feb 11, 2019

@czurnieden Thank you very much for your efforts! I guess we should not bother too much and just switch the tests off on mk68 if we cannot get it to work in reasonable time. One remark about the change you made - now you are testing using DBL_EPSILON, which is not good enough on IEEE754 compliant platforms. I suggest you use an ifdef in s_compare_doubles to enforce exact equality on those.

Copy link
Member

@minad minad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s_compare_doubles should use exact equality on IEEE754 compliant platforms.

@sjaeckel sjaeckel added this to the next milestone Feb 11, 2019
@czurnieden
Copy link
Contributor Author

@minad What? Oh, you are killing me! ;-)
But you are right, should use it exclusively for non IEEE754 compliant platforms. Especially for those who lied about it ;-)

I see that Steffen had already put it to the next milestone. So if you are not OK with it or it fails tests in a way that is not quickly repairable(The Lintian error is still dangling) we should kick m68k support for the get/set_double out of the program. I've already done it in one of teh earlier commits, but I doubt that you can simply merge that old stuff without further massage.

As far as I remember: the very last deadline for the Debian freeze is tomorrow, so it should happen today.

@minad
Copy link
Member

minad commented Feb 11, 2019

Ok, I suggest to do it as follows:

  1. just disable the double tests on mk68k
  2. the x32 relocation issue still needs work, but debian can live for now with its own patched version.

minad added a commit that referenced this pull request Feb 11, 2019
@dod38fr
Copy link

dod38fr commented Feb 12, 2019

@czurnieden Debian soft-freeze is indeed today. On the other hands, x32 and m68k arches are not part of the "official" release, i.e. a build or test failure in these packages will not stop new libtommath package to enter next Debian stable release.

As a matter of fact, libtomath 1.1.0 is already part of testing (See https://tracker.debian.org/pkg/libtommath) so it will be shipped in Buster (the code name for next Debian stable release).

So, there's no need to rush a fix for x32 or m68k.

All the best

@minad
Copy link
Member

minad commented Feb 12, 2019

@dod38fr Thank you for clarification!

@czurnieden
Copy link
Contributor Author

@dod38fr

So, there's no need to rush a fix for x32 or m68k.
I'm an old fart, I don't rush anymore ;-)

The fact that Lintian failed in the m68k build is caused by adding -fPIE by way of adding /usr/share/dpkg/pie-link.specs and /usr/share/dpkg/pie-link.specs to the build of the shared library. That does work for most architectures but is not guaranteed. It's also in the name: the "E" in "PIE" stands for "Executable". The flags to enable PIC on the other side are necessary for hardening and must stay (I think it is a Debian "must have"?).

I could switch it of manually in makefile.shared with additional LDFLAGS

ADDITIONAL_LDFLAGS= -Wl,--warn-shared-textrel -Wl,--fatal-warnings -fno-PIE -no-pie
LTCOMPILE = $(LIBTOOL) --mode=compile --tag=CC $(CC)
...

And put them behind LDFLAGS like $(LDFLAGS) $(ADDITIONAL_LDFLAGS)everywhere but fortestandtest_standalone` because those are hungry executables in need of some PIE. See also the entry for "pie" in https://manpages.debian.org/testing/dpkg-dev/dpkg-buildflags.1.en.html#hardening where it is even recommended.

Ah, forget it, made a commit with the changed makefile.

Those additional flags work well (with the s_compare_doubles patch) as you can see:

Build Architecture: m68k
Build Type: full
Build-Space: n/a
Build-Time: 1081
Distribution: sid
Host Architecture: m68k
Install-Time: 19
Job: /home/czurnieden/M68K/LTM_EXP2/libtommath_1.1.0-3~exp2.dsc
Lintian: pass
Machine Architecture: amd64
Package: libtommath
Package-Time: 1179
Source-Version: 1.1.0-3~exp2
Space: n/a
Status: successful
Version: 1.1.0-3~exp2
--------------------------------------------------------------------------------
Finished at 2019-02-13T12:51:21Z
Build needed 00:19:39, no disk space

Problem: It all works nice and dandy, yes, wonderful, but on my own machine, a normal x86_64 building for x86_64:

gcc-4.8.real: error: unrecognized command line option '-no-pie'

That is caused by a missing --enable-default-pie for the ./configure script for GCC. So it is only valid for the tested GCC 8.2.0-20 and a Debian build environment of the same age. Not a problem for you but for the rest, so it would be a Debian-only patch.

I do not know if the problem with PIE is the cause for the x32 problems, too, forgot if I checked it before applying the patch but I doubt it.

@sjaeckel sjaeckel changed the title patch for m86k floats patch for m68k floats Mar 1, 2019
@sjaeckel
Copy link
Member

sjaeckel commented Mar 4, 2019

closed in favor of #160

@sjaeckel sjaeckel closed this Mar 4, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants