Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix SQE PrepareConnect method #10

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

ianic
Copy link
Contributor

@ianic ianic commented Sep 6, 2023

io_uring_prep_connect requires pointer to C struct sockaddr and len as parameters. To convert between syscall.Scokaddr interface and raw socket pointer and len standard library uses private function sockaddr(). As advised by Russ Cox here I copied few private functions from stdlib to make that conversion for inet4, inet6 and unix domain sockets.

Make tests for connect. Using stdlib server and giouring to connect to that tcp server.

io_uring_prep_connect requires pointer to C `struct sockaddr` and len as
parameters. To convert between `syscall.Scokaddr` interface and raw socket
pointer and len standard library uses private function
[sockaddr()](https://github.com/golang/go/blob/4c5130a96eabd5d9a72a43aa8e895b668fbd653b/src/syscall/syscall_unix.go#L264C35-L264C35).
As advised by Russ Cox [here](https://groups.google.com/g/golang-nuts/c/B-meiFfkmH0/m/PyFC84kPaFkJ)
I copied few private functions from stdlib to make that conversion for inet4, inet6 and
unix domain sockets.

Make tests for connect. Using stdlib server and giouring to connect to the tcp
server.
prepare.go Outdated

// convert syscall.Sockaddr interface to raw pointers
// ref: https://groups.google.com/g/golang-nuts/c/B-meiFfkmH0/m/PyFC84kPaFkJ
func sockaddr(addr syscall.Sockaddr) (unsafe.Pointer, uint32, syscall.Errno) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is an easier way to convert syscall.Sockaddr to a pointer. We can link private function from golang standard library as follows:

//go:linkname sockaddr syscall.Sockaddr.sockaddr
func sockaddr(addr syscall.Sockaddr) (unsafe.Pointer, uint32, error)

prepare.go Outdated
func (entry *SubmissionQueueEntry) PrepareConnect(fd int, addr *syscall.Sockaddr, addrLen uint64) {
entry.prepareRW(OpConnect, fd, uintptr(unsafe.Pointer(addr)), 0, addrLen)
func (entry *SubmissionQueueEntry) PrepareConnect(fd int, sa syscall.Sockaddr) error {
addrPtr, addrLen, errno := sockaddr(sa)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that the conversion here is potentially dangerous. According to the documentation:

As with any request that passes in data in a struct, that data must remain valid until the request has been successfully submitted. It need not remain valid until completion. Once a request has been submitted, the in-kernel state is stable. Very early kernels (5.4 and earlier) required state to be stable until the completion occurred. Applications can test for this behavior by inspecting the IORING_FEAT_SUBMIT_STABLE flag passed back from io_uring_queue_init_params(3).

By allocating a syscall.RawSockaddr* structure and passing a pointer to it to the SQE in the form of a uintptr, we have no assurance that the structure will not be released by the GC before submission

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, we can't create pointers inside prepare methods which can guarantee to be alive until SQE is submitted to the kernel. It's up to the caller to ensure that.

I changed signature of the PrepareConnect to accept addr pointer. Now it's up to the caller to create raw pointers from sockaddr and to ensure that pointer lifetime.

With the introduction of runtime.Pinner in Go 1.21 it is now easier for caller to ensure that lifetime. Callers should pin pointers passed to the giouring SQE and unpin them when they get CQE. For the pointers which kernel will copy, like in this case, it is enough for them to be pinned until that SQE is submitted to the kernel but for some other pointers like buffers in recv client needs to keep them alive and not moved until completion. That's my current understanding of how to use pointers with io_uring.

Note: io_uring was mentioned few times while introducing Pinner API, here.

This way PrepareConnect is similar to other functions which accept addr, addrLen
params. It's up to the caller to make transformation to raw pointers and ensure
that pointer is alive and not moved until request is completed.
Instead of making direct syscall.
@pawelgaczynski
Copy link
Owner

Hi @ianic.

I prepared a pull request with enhancements for prepare helpers. Please see if this solution for PrepareConnect method will be suitable:
https://github.com/pawelgaczynski/giouring/pull/12/files#r1335160648
(I will be very grateful if you take a look at the whole PR)

I checked the solution with runtime.Pinner API:

  • Each SubmissionQueueEntry is wrapped by a structure that has a field of type Pinner.
  • Prepare* method calls the "Pin" method to ensure memory validity.
  • After the SQE submission, the "Unpin" method is called.
    Unfortunately, the performance of this solution is very low:

No pin:

Submitted 143634432 entries in 10.000165518s seconds
14363443 ops/s

With runtime.Pinner:

Submitted 36192256 entries in 10.000359611s seconds
3619225 ops/s

@ianic
Copy link
Contributor Author

ianic commented Sep 25, 2023

It feels strange to me that PrepareConnect method accepts *RawSockaddrAny. I would rather stick with unsafe.Pointer at this time.
func (entry *SubmissionQueueEntry) PrepareConnect(fd int, addr *unsafe.Pointer, addrLen uint64) feels clearer to me.

I assume that client starts with tcp address represented as string. That is IPv4 of IPv6 tcp address in string format. It then uses something like net.ResolveTCPAddr to get net.TCPAddr. That can be converted to syscall.Sockaddr which has underlaying syscall.RawSockaddrInet4 or syscall.RawSockaddrInet6. sockaddr converts that syscall.Sockaddr to unsafe.Pointer and len which can be used in syscalls.
It feels wrong to convert unsafe.Pointer pointing *RawSockaddrInet4 or *RawSockaddrInet6 to *RawSockaddrAny just to convert it back to unsafe.Pointer in PrepareConnect again.

It is strange to pass to the function both pointer to the concrete type, and length. RawSockaddrAny has of course known length, 112 bytes. Here we will probably pass pointer to the RawSockaddrInet4/6 types which are smaller types 16/28 bytes casted as pointers to RawSockaddrAny.

Methods which accept unsafe.Pointer/len can be directly converted to syscalls. I'll stick to that as lower level API we could always build higher level abstractions on top of that.

@ianic
Copy link
Contributor Author

ianic commented Sep 25, 2023

Regarding API for other methods...

Some time ago I commented in #7 that I would like that giouring API is modeled on syscall stdlib methods. Later I realized that there is essential difference between sync syscalls from stdlib and the delayed async calls we have with io_uring. Stdlib can use runtime.KeepAlive and //go:uintptrescapes to ensure that memory passed to kernel is not GC-ed. With io_uring we first have delayed syscall, Enter is not called on each Prepare, memory passed to the prepare must be live and not moved until at least Enter is executed, and for some methods (Recv) until CQE is received. That makes difference so I'm not sure any more that we should just copy stdlib syscall method signatures.

Methods with buffers...
I see two options, lets use Recv as example:
func (entry *SubmissionQueueEntry) PrepareRecv(fd int, buf []byte, flags int)
func (entry *SubmissionQueueEntry) PrepareRecv(fd int, ptr unsafe.Pointer, length uint32, flags int)
Currently we have something in between, passing both []byte and length. I think that is necessary complication, slice has well defined length.
Of those two variants second is probably more explicit in regard that is callers responsibility to ensure pointer liveness but I'm still slightly more for the first.

uintptr or unsafe.Pointer in method signatures...
I feel that unsafe.Pointer give us more options, so I like to see that instead of uintptr.

@baryluk
Copy link

baryluk commented Feb 6, 2024

The same happens when using PrepareSendto

I just do this:

// Borowed from src/syscall/ztypes_*.go
type _Socklen uint32


func socketFD(conn syscall.Conn) int {
        raw, err := conn.SyscallConn()
        if err != nil {
                return 0
        }
        sfd := 0
        raw.Control(func(fd uintptr) {
                sfd = int(fd)
        })
        return sfd
}

// Borrowed from src/syscall/syscall_{unix,linux}.go
// with some adjustments due to .raw being private.
func sockaddr(sa *syscall.SockaddrInet4, raw *syscall.RawSockaddrInet4) (unsafe.Pointer, _Socklen, error) {
  if sa.Port < 0 || sa.Port > 0xFFFF {
    return nil, 0, syscall.EINVAL
  }
  raw.Family = syscall.AF_INET
  p := (*[2]byte)(unsafe.Pointer(&raw.Port))
  p[0] = byte(sa.Port >> 8)
  p[1] = byte(sa.Port)
  raw.Addr = sa.Addr
  return unsafe.Pointer(raw), syscall.SizeofSockaddrInet4, nil
}



.....
func .... {

        remoteAddr, localAddr, conn, err := createUnconnectedSocketWithSourceBind(dst_hostport0[0], source_ip_or_dns)
        if err != nil {
                return nil, err
        }

        fd := socketFD(conn)
        if fd == 0 {
                return nil, errors.New("Failed getting file descriptor of a socket")
        }

        // log.Printf("remoteAddr: %#v", remoteAddr)
        // remoteAddr: &net.UDPAddr{IP:net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0x7f, 0x0, 0x0, 0x1}, Port:60001, Zone:""}

        udpRemoteAddr := remoteAddr.(*net.UDPAddr)
        if udpRemoteAddr == nil {
                return nil, errors.New("Not UDP addr?")
        }

        sa := &syscall.SockaddrInet4{
                Port: udpRemoteAddr.Port,
                // IP might be [16]byte, to hold both IPv4 and IPv6,
                Addr: [4]byte(udpRemoteAddr.IP.To4()),
        }
        log.Printf("sa: %#v", sa)
        rsa := &syscall.RawSockaddrInet4{}
        rsaPtr, salen, err := sockaddr(sa, rsa)
        if err != nil {
                return nil, err
        }
        log.Printf("rsa: %#v   salen: %d", rsa, salen)

        // Note: syscall.Sockaddr is an interface.

        // sockAddrAddr, sockAddrLen, err := sockAddr.sockaddr()
        // syscall.RawSockaddrInet4{
        // Family: syscall.AF_INET_4,
        // Port: dst_port,
        // Addr: 
        // }

}

then do this:

                        fd := outputSocket.fd
                        entry.PrepareSendto(
                                /*sockfd=*/ fd,
                                /*buf=*/ data,
                                /*flags=*/ flags,
                                /*addr=*/ (*syscall.Sockaddr)(outputSocket.rsaPtr),
                                /*addrlen=*/ uint32(outputSocket.saLen))

(Currently I use udp/tcp dial to create sockets, I know one can bypass, and just do it via syscalls to socket / bind, etc.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants