Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore(gw): extract logical functions to improve readability #8883

Closed
wants to merge 2 commits into from

Conversation

lidel
Copy link
Member

@lidel lidel commented Apr 13, 2022

This is #8855 by @justincjohnson but as a local branch to fix CI / permissions (#8855 (comment)).

TODO

  • cleanup
  • wait for CI
  • get a temperature check from other devs if this is better than the old code

Justin Johnson and others added 2 commits April 13, 2022 21:32
Extract functions from getOrHeadHandler to improve readability and prepare for later refactorings
@lidel lidel marked this pull request as ready for review April 13, 2022 20:24
@Jorropo
Copy link
Contributor

Jorropo commented Apr 13, 2022

I don't have a strong opinion, overall I think it's a good change.
But I don't like how you are doing the webError in the subfunction but since you need to abort getOrHeadHandler too you have to return a bool highjacking the control flow.

What I would expect is the checks in the sub functions but the webError emited in getOrHeadHandler so the control flow make sense. Right now it really feel you are highjacking the control flow with bools.
Like this:

if validateTheBidule(r, beep, bop) {
  webError(r, "Bidule is failing")
  return
}

Or like this:

if err := validateTheBidule(r, beep, bop); err != nil {
  webError(r, fmt.Errorf("Bidule is failing %w", err))
  return
}

@justindotpub
Copy link
Contributor

That makes sense to me @Jorropo. I had similar thoughts when coding it and wasn't sure. Since I don't own this branch, should I just open a new PR with all of the changes?

@Jorropo
Copy link
Contributor

Jorropo commented Apr 13, 2022

@justincjohnson I'll let lidel respond, but FYI if you open the PR from a personal github repo, you wont have thoses issues and we wouldn't need to repush everything. 🙂

@justindotpub
Copy link
Contributor

Yep, I'm slowly learning. 🙂

@lidel
Copy link
Member Author

lidel commented Apr 14, 2022

@Jorropo returning error makes sense for simple check cases, but we have things like fixupSuperfluousNamespace and isProtocolHandlerRedirect which produce a HTTP redirect and should also return to end flow. Is it ok to return bool from these, or do you see a better way that is easier to follow?

Copy link
Contributor

@schomatis schomatis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @justincjohnson! This is a great step in the right direction. Having read this code recently for the first time I wish I had encountered your patch instead of the current giant function.

Similar to what has already been said, we need to report errors in the callee but handle them in the parent (to avoid burying webError calls all over the place). If indeed many of these functions terminate flow even without erroring (and if overloading the errors to signal that seems forced, your call), then these functions should just return two values (bool, error):

  • As usual first check the error: if not nil then report it (webError),
  • if no error, check the termination boolean and just return silently.

Take a closer look at the logic to see where this separation would make sense. For example, I'm seeing

func (i *gatewayHandler) isGettingFirstBlockError(w http.ResponseWriter, r *http.Request, begin time.Time, contentPath ipath.Path, resolvedPath ipath.Resolved) bool {
	// Update the global metric of the time it takes to read the final root block of the requested resource
	// NOTE: for legacy reasons this happens before we go into content-type specific code paths
	_, err := i.api.Block().Get(r.Context(), resolvedPath)
	if err != nil {
		webError(w, "ipfs block get "+resolvedPath.Cid().String(), err, http.StatusInternalServerError)
		return true
	}
	ns := contentPath.Namespace()
	timeToGetFirstContentBlock := time.Since(begin).Seconds()
	i.unixfsGetMetric.WithLabelValues(ns).Observe(timeToGetFirstContentBlock) // deprecated, use firstContentBlockGetMetric instead
	i.firstContentBlockGetMetric.WithLabelValues(ns).Observe(timeToGetFirstContentBlock)
	return false
}

which just seems to return the boolean "stop" just in case of an error. In this case the error itself is the only thing you need upstream to signal termination of the flow as there is no case where we don't have an error but we still need to stop this from continuing.

In contrast:

func isProtocolHandlerRedirect(w http.ResponseWriter, r *http.Request, logger *zap.SugaredLogger) bool {
	if uriParam := r.URL.Query().Get("uri"); uriParam != "" {
		u, err := url.Parse(uriParam)
		if err != nil {
			webError(w, "failed to parse uri query parameter", err, http.StatusBadRequest)
			return true
		}
		if u.Scheme != "ipfs" && u.Scheme != "ipns" {
			webError(w, "uri query parameter scheme must be ipfs or ipns", err, http.StatusBadRequest)
			return true
		}
		path := u.Path
		if u.RawQuery != "" { // preserve query if present
			path = path + "?" + u.RawQuery
		}

		redirectURL := gopath.Join("/", u.Scheme, u.Host, path)
		logger.Debugw("uri param, redirect", "to", redirectURL, "status", http.StatusMovedPermanently)
		http.Redirect(w, r, redirectURL, http.StatusMovedPermanently)
		return true
	}

	return false
}

has a case near the bottom where we want to end the flow (last return true) but there is no error to report. In this kind of function we could use the double return value.

@schomatis
Copy link
Contributor

Thinking some more about this, a function like isProtocolHandlerRedirect is probably a good candidate for even a further subdivision where you have the first part, a validation logic which just returns an error if some of the preconditions fail to hold, and a second part with the actual logic of "is this a redirect" which won't fail and just return true/false and you decide upstream to end the flow with the redirect. This seems a bit cleaner to avoid compounding return values, but it's not a blocker to land this.

@justindotpub
Copy link
Contributor

Thank you all for the feedback. 🙏 All of it seems quite reasonable, and I'm more than willing to implement the suggestions.

If I may humbly share some thoughts (as a newb to both Go and go-ipfs)...

I'm feeling like there is tension between on the one hand the desire to extract smaller functions to make code more readable, and on the other hand the extra logic necessary to push error handling up to the caller and signal whether the caller should return.

When I first started working on adding redirect support to the gateway, I found it challenging to reason about what was happening in getOrHeadHandler due to its size. It seemed like extracting smaller functions might not be a common pattern in the Go community, so I wasn't sure if I should even be doing that. I decided to proceed anyway because it really helped me read the code, and also because it looked like I might need to reuse some of the logic.

When I started extracting these functions, it felt like I should probably be doing error handling at the caller instead of in the extracted functions, like you've been suggesting. In some cases that was pretty straight forward, since all I needed to return was the error to write, and I could do the standard nil check and write the error. Others were more involved though, since they needed to return the error, an extra message, and a status code, so the caller could use those in writing the error to the response, as well as a boolean to indicate if the caller should return and thus halt processing of the request. All of this is doable, but it was starting to feel like the cost of this extra code might outweigh the benefit of extracting a small function for improved readability. I pondered whether that tension had lead others before me to not extract a smaller function in the first place, thus leading to one large function. So I went back to handling the errors in the extracted functions and just passing back a boolean to indicate if the caller should return. That presented some function naming challenges, which lidel assisted with.

I'm going to proceed with the suggestions and see how the result feels, but both approaches feel like tradeoffs to me, and I think I personally will lean toward whatever is most readable.

Anyway, just one person's thoughts while going through the code... 🙂

@justindotpub
Copy link
Contributor

Any thoughts on this one? #8885

@Jorropo
Copy link
Contributor

Jorropo commented Apr 15, 2022

Closing in favor of #8885

@Jorropo Jorropo closed this Apr 15, 2022
@Jorropo Jorropo deleted the refactor/justincjohnson-8855 branch April 15, 2022 08:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants