captain.log - go-ipfs #2247

Closed
whyrusleeping opened this issue Jan 26, 2016 · 14 comments
Labels
need/community-input Needs input from the wider community

Comments

@whyrusleeping
Member

January 26, 2016

Working on figuring out how to set up various networks virtually for testing. Found a couple of GUI tools that appeared to work, but they seemed to have a fairly steep learning curve (and weren't easily configurable in the ways I wanted):

Now I'm working on using Docker's network subcommand and getting frustrated with how useless it is. It does 60% of what I want, but does it in a way that makes it impossible for me to complete the final 40%. I'm hoping the devs get back to me eventually on a PR I filed to maybe make this easier (they will probably continue to ignore it).

@Kubuxu
Member

Kubuxu commented Jan 26, 2016

Have you thought about scripting on top of just network namespaces? It is essentially what everything uses to achieve separation: https://lwn.net/Articles/580893/

@whyrusleeping
Member Author

Yeah, that's actually what I'm falling back to now. It's looking more promising than dealing with docker stuff.

@whyrusleeping
Member Author

sent out an email or two, and now i'm changing focus to ipfs files work in the hopes of getting npm on ipfs unblocked.

@whyrusleeping
Member Author

Been working on getting libp2p extracted. The big goal is to have gx working nicely for go-ipfs so we can easily extract and modularize it. It's 98% of the way there; I'm just working on some testing CI stuff now that seems to not get along with the new vendoring scheme. Should be resolved soon.

In other news, i've been noticing some increased memory usage on latest master. Still needs to be inspected and investigated...

@whyrusleeping
Member Author

Captain's Log Update, Feb 23rd

Inspired by @diasdavid, I think it's a good time to write up an update here.

We're well on our way towards shipping an official release for 0.4.0: the features are all ready, the obvious bugs have been stomped on, and all that's left now is to convince ourselves that it works as intended. Towards that end, @dignifiedquire and I (mostly @dignifiedquire so far) have been working on constructing a test suite that will really stress out the codebase. The plan is to pump gigabytes (or more?) of data into an ipfs node every which way, and do sanity and integrity checks along the way, to make sure we aren't screwing anything up. That work has started over here.

In other news, since our last update we have begun the transition to using gx for all of our vendoring. This is really cool for a few different reasons. First, it allows us to neatly vendor our dependencies without committing them all to our git repo and making git clones take ages. It also allows us to more easily break up the ipfs codebase into smaller subpackages, since before 'breaking the packages up' meant moving them into the godeps folder and being frustrated whenever we tried to update them (don't search the irc logs for things I've said about vendoring, it's not pretty). Another really cool side effect of switching to gx is that we now get to really nicely dogfood our own system: since gx uses ipfs under the hood, every time we install ipfs from source, or run CI, we're putting more usage on the codebase.

On to my final update: you might have noticed that your PR's test runs look a bit more green lately. A big thanks to @noffle and @chriscool for pushing a bunch of fixes to our tests through and making our CI more reliable. This is super awesome as it gives us more confidence about the code we're shipping, and lets us ship code more quickly (no more spamming the re-run button).

How can you help?

There are a few things that will help move go-ipfs along more quickly:

  1. First off, take a look at our 'easy issues' list and pick something up from there, or if you want more of a challenge, we also have labels for 'moderate difficulty' and 'hard difficulty'.
  2. Help out making our windows CI tests greener. There's a lot to do here, and @chriscool has been taking the lead, but every little bit helps! Take a look at the appveyor failures on any recent PR, and see if you can address the issue.
  3. If anyone wants to help out with the gx vendoring, any of the packages in the thirdparty folder are great candidates to be broken out of the main repo into their own repos, and vendored back in with gx. If you want to take one or more of these on, let me know in irc or on a github issue if you need any guidance. I'm still working on the UX of gx, so there are likely to be some rough spots.

Thanks!

@whyrusleeping
Member Author

whyrusleeping commented Apr 18, 2016

go-ipfs Q2 roadmap

I wrote up a roadmap for go-ipfs in Q2 2016. If you see something on this list you want to do and it has an issue, claim the issue there. If there isn't an issue for it, go ahead and open one.

Attention New Contributors

If you're looking for something fairly easy to do, check out the Misc Issues section at the bottom of this list. Most of those tasks are isolated, easier things that we would love to have completed.

  • fix random test failures
    • check issues for label 'test_failure'
  • Improve memory usage
    • Goal: ipfs doesn't run out of memory on mars for 1 month
    • identify where most memory usage comes from
    • find and fix leaks
      • providers
      • multiaddr storage
      • mfs caching
    • add more IPFS_LOW_MEM knobs
      • dht query parallelism
      • libp2p dial concurrency
  • unixfs sharding
    • unixfs sharding spec (@jbenet)
    • revive old branch for this
  • go-iprs
    • ipld?
  • gateway code in its own repo
    • move ipfs/go-ipfs/commands to ipfs/commands
    • move ipfs/go-ipfs/core/corehttp to ipfs/gateway
    • vendor all back in with gx
  • fix progress bars
    • no negative infinity! (before total size is computed)
    • 'ipfs get' progress bar working
  • Reduce idle bandwidth usage
    • Goal: idle less than 100kbps
      • measure current idle BW loads to make sure this is reasonable
    • investigate why dht is so chatty
      • watch ipfs log tail
    • make fewer calls to findpeer and findprovs
      • identify where all these calls are made
    • fix 'ipfs stats bw' discrepancy
    • have test (using mocknet?) that tests bandwidth usage
  • Make bitswap waste less bandwidth
    • Goal: 50% reduction in duplicate block sends
      • Test to measure this
    • reintroduce pluggable strategy
  • Fix Providers Problem
    • Goal: 10x bandwidth reduction from providing
    • provide-many redo
  • Reduce bitswap RTT
    • decide on pathing language
  • Improve Disk performance
    • Goal: 1TB repo working nicely
    • Goal: 2x improvement in 'ipfs-whatever' ops/s
    • large repo perf (Big repo causes resource hog hangs #2405)
    • investigate flatfs calls
      • reduce number of useless calls
    • dagservice lru caching
      • configurable size
    • blockstore 'has block' bloom filter cache
    • improve query perf
    • different datastore backends
      • datastore config PR
      • introduce sql-datastore
  • NAT Traversal (go-ipfs side)
    • DHT FindPeers only returns first result
  • Misc Issues (great place to start contributing!)
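
To make the "blockstore 'has block' bloom filter cache" item above a bit more concrete, here is a rough sketch of the idea: a Bloom filter sits in front of the slow lookup and answers "definitely not here" without touching the disk. Everything in it (the `bloom` and `cachedStore` types, the filter size, the FNV-based hashing, and the map standing in for the on-disk store) is an assumption for illustration, not the go-ipfs implementation.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bloom is a fixed-size Bloom filter that derives k positions per key
// from two FNV hashes (double hashing).
type bloom struct {
	bits []bool
	k    int
}

func newBloom(m, k int) *bloom { return &bloom{bits: make([]bool, m), k: k} }

func (b *bloom) positions(key string) []int {
	h1 := fnv.New64a()
	h1.Write([]byte(key))
	h2 := fnv.New64()
	h2.Write([]byte(key))
	a, c := h1.Sum64(), h2.Sum64()|1 // force an odd step so positions vary
	out := make([]int, b.k)
	for i := range out {
		out[i] = int((a + uint64(i)*c) % uint64(len(b.bits)))
	}
	return out
}

func (b *bloom) add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// mayContain returns false only if the key was never added.
func (b *bloom) mayContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

// cachedStore is a hypothetical blockstore; the map stands in for disk.
type cachedStore struct {
	filter      *bloom
	disk        map[string][]byte
	diskLookups int
}

func (s *cachedStore) Put(key string, data []byte) {
	s.disk[key] = data
	s.filter.add(key)
}

// Has consults the filter first and only falls through to the (slow)
// store when the filter says the key might be present.
func (s *cachedStore) Has(key string) bool {
	if !s.filter.mayContain(key) {
		return false
	}
	s.diskLookups++
	_, ok := s.disk[key]
	return ok
}

func main() {
	s := &cachedStore{filter: newBloom(1024, 3), disk: map[string][]byte{}}
	s.Put("block-a", []byte("hello"))
	fmt.Println(s.Has("block-a"), s.Has("block-b"), s.diskLookups)
}
```

A plain Bloom filter cannot forget keys, so a real store that deletes blocks would need to rebuild the filter periodically or use a counting variant.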

@daviddias
Member

daviddias commented Apr 18, 2016

As discussed on the hangout of April 18, I'm listing the items I'm most interested in and that will unblock things this quarter, ordered by priority:

@whyrusleeping
Member Author

Quick Q2 update:

The yamux hang fixes have resolved a lot of what we previously felt were NAT traversal issues. I'm going to lower the priority of further NAT traversal work in favor of resource consumption tasks. This includes making bitswap smarter, improving the dht providers problem, and getting utp integration to a point where we can rely on it.

@whyrusleeping
Member Author

whyrusleeping commented Jun 13, 2016

Q2 update

Q2 is ending soon, and we've got a lot of stuff done already. First off, I'd like
to thank everyone who has helped push ipfs forward. In no particular order,
thank you @Kubuxu for all the work extracting packages into gx, tagging and
responding to issues, and pushing a bunch of fixes across the codebase. Thank
you @RichardLitt for all the work improving our cli docs and generally making
it much easier for users to intuitively discover how ipfs works. Thank you
@kevina for taking point and getting a working implementation of 'zero copy
ipfs adds'. This is a highly requested feature that we are excited to have.
Thank you @chriscool for helping improve our tests and reviewing all of our
build system code. It's great to have such experienced input. And finally, to
everyone else who has been contributing, filing issues, fixing bugs, testing
things out and helping out in the community: ipfs wouldn't be what it is without
you all.

That said, we still have a few things to get done if we want to stick to the
goals we set back in April. Most of these are related to overall ipfs
stability. Memory usage in ipfs still grows steadily over time due to storing
information about providers and peer addresses in memory, instead of on disk.
We're also abusing the disk pretty heavily, in ways that can be improved fairly
easily by adding some small caches in certain places.
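
As a rough illustration of the kind of small cache meant here, the sketch below is a bounded LRU cache built on the standard `container/list` package. The `lruCache` type, its string keys, and the capacity of 2 are assumptions for the example; it is not the cache used in go-ipfs.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element holds.
type entry struct {
	key   string
	value []byte
}

// lruCache keeps at most `capacity` items and evicts the least
// recently used one when full.
type lruCache struct {
	capacity int
	order    *list.List // front = most recently used
	items    map[string]*list.Element
}

func newLRU(capacity int) *lruCache {
	return &lruCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[string]*list.Element),
	}
}

// Get returns the cached value and marks it as recently used.
func (c *lruCache) Get(key string) ([]byte, bool) {
	el, ok := c.items[key]
	if !ok {
		return nil, false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

// Put inserts or updates a value, evicting the oldest entry if needed.
func (c *lruCache) Put(key string, value []byte) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

func main() {
	c := newLRU(2)
	c.Put("a", []byte("1"))
	c.Put("b", []byte("2"))
	c.Get("a")              // "a" is now the most recently used
	c.Put("c", []byte("3")) // evicts "b"
	_, hasB := c.Get("b")
	fmt.Println("b still cached:", hasB)
}
```

This version is not safe for concurrent use; a shared cache would need a mutex around `Get` and `Put`.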

Things left to do for Q2:

Taken off the Q2 milestone

  • add lru cache to dagservice (cache merkledag nodes in dagservice #2849)
    • in progress by @kevina, might not be worth the overhead
    • Reason: Does not provide provably significant improvements
  • go-ipld feature parity with js-ipld
    • Reason: we technically have this already
  • store peerstore information to disk (Store peerstore data to disk #2848)
    • not a significant amount of memory usage compared to other efforts, dropping priority

Misc Extra Issues we'd like done

If anyone would like to help out, just pick up one of the issues, start
hacking, and feel free to let us know any questions you have.

@whyrusleeping
Member Author

whyrusleeping commented Jun 21, 2016

If anyone wants a chance to take ownership of something, the go-ipfs-api repo needs some TLC.

Namely:

@whyrusleeping
Member Author

A short update

Hey Everyone,
We're just wrapping up our planning for the next few months of work. I'll write an update about that soon, once I synthesize it all into a better schedule. For now, a quick update on the 0.4.3 release. The first release candidate has been pushed out and is available to try out (you can download it here). We have found two bugs in it that are in the process of being resolved. Once the fixes are merged we will ship a second release candidate which (barring any unforeseen issues) will become the 0.4.3 release.

Thank you so much to everyone who has helped us get to where we are for this release, we're very grateful for all the support!

@whyrusleeping
Member Author

IPFS 0.4.3 and IPLD

Welcome back to another thrilling installment of the go-ipfs captain's log. This time around I've got some news about our short and mid-term goals for the next few months. But before I get to that, I want to thank everyone who has contributed to this release. This is going to be the best release of ipfs yet, with performance and stability improvements across the board (not to mention some nifty new features). The changelog is gigantic.

Ipfs v0.4.3 is almost out the door. We're going to push a third release candidate out soon that fixes a bug caused by the stdin changes. And when go1.7 finally releases we're going to build the official v0.4.3 binaries with go1.7. We're really excited for this for a number of reasons: go1.7 brings a large number of performance improvements across the board. It also adds support for macOS Sierra; running current OSX binaries on Sierra results in instability due to changes in the operating system.

Next up, we have a couple of big milestones we're planning on working towards. The biggest of these is IPLD integration into go-ipfs. It's been a long road to this, but we're finally making it a major development priority in the go codebase. The current roadmap for accomplishing this looks roughly like:

There will be more parts to accomplishing that, and as we discover them they will be added to the ipld milestone. Full IPLD integration will ship in ipfs v0.5.0.

The next big milestone we are working towards is interoperability with js-ipfs. We're already very close on this one, and just need a few changes on that end to get things working smoothly. You can track that here, on the js-ipfs interop milestone, but the main blockers are:

Our third big push in the coming months is reliability and resource consumption. AKA, making ipfs a better citizen of your computer. Towards this, we're working to improve test coverage on all of our packages, reducing the amount of memory, bandwidth, CPU and disk space that ipfs consumes (or is allowed to consume), and putting more stress testing on ipfs so that we can ship releases with more confidence.

I have opened an issue in go-ipfs to track the progress towards more complete code coverage there, and have also opened a similar one in libp2p. Improving code coverage is the easiest way to help out and get your hands dirty with the codebase. There is a short how-to guide on checking the level of coverage here.

For the resource consumption, we have a milestone to track all of the related issues. Getting these all closed will be a HUGE deal. People have been wanting a way to limit the bandwidth usage of ipfs for quite some time, so that is going to be the biggest aim on this milestone. But it's not the only focus: lowering the memory footprint will be killer for embedded applications of ipfs. Adding limits to the storage of the repo will make it easier to run nodes in shared environments so you won't have to worry about destroying your disks. And reducing the baseline CPU usage of ipfs will bring mobile ipfs that much closer.

Beyond those milestones there is still a lot of work to be done. If you're a new contributor and looking to get involved, or even a current contributor just looking for something to do, take a look at the help wanted label in our issue tracker and see if anything piques your interest. A few curated issues that I think would be pretty good places to start are:

Aside from go-ipfs, we have a bunch of other exciting stuff going on. In no particular order:

  • We now have a libp2p organization for all the libp2p related repositories to live in. All our existing libp2p code will be slowly migrating over to it.
  • We also now have a multiformats organization where all of the 'multi' code is going to stay. Having all of this
  • We have started down the path of a fully peer to peer distributed pubsub system built on top of libp2p.

Thanks for reading, and Happy Hacking!

@whyrusleeping
Member Author

whyrusleeping commented Aug 23, 2016

Update after a long day at sea

I made a pass through all 500 of go-ipfs's open issues and tried to start getting a handle on them. I started by trying to make sure each issue was marked with one of three labels: 'help_wanted', meaning that it's actionable; 'discussion', meaning that nothing is immediately actionable, but that people should read the issue and potentially weigh in; or 'verify', meaning that the issue reports a problem that may or may not be resolved already. Finally, I closed most of the rest if they were resolved.

How you can help

Now that this 'quick' pass has been done, there's lots to do. One very helpful thing would be to take a look at all the issues that need verification, see if the problem reported has been resolved or not, and then comment that back in the issue. This will help us either close more issues, or generate more actionable items.

There are also now a pretty significant number of issues marked 'help wanted' (137 total at the time of writing). These issues are things that should be actionable and that we need some help with. Take a look through that list and see if anything looks interesting to you, then claim it and start hacking.

What's next?

Sometime in the next few weeks I'm going to go through the entire list again and try to further refine the tagging and titles on issues to clean things up. The more people help out in closing out issues, the easier that whole thing is gonna be.

Thanks for reading! Happy Hacking

@em-ly em-ly added the need/community-input Needs input from the wider community label Aug 25, 2016
@whyrusleeping
Member Author

whyrusleeping commented Jul 3, 2017

Ipfs Development Update (July 2017)

It's been a while since we had a good update on the direction of ipfs
development. For the rest of this year, we will be focused on making ipfs
handle large datasets with ease. More specifically, we want to improve the
performance of the on-disk storage engine that ipfs uses. Currently, we use a
datastore called flatfs, that stores each block in its own file on disk,
sharding files into subdirectories based on their hash. Using the filesystem as
a database like this turns out to not be that fast (the cost of fsync dominates
performance). We are working towards giving users the ability to specify how
ipfs should store their data, and offering several options that will work well
for different workloads (badger, sql, leveldb, and more). One goal here is for
the time it takes to add content to ipfs to be well within an order of
magnitude of the time it takes to hash everything and copy it to another
location on disk.

Past that, we want to make sure that transferring those large files is
efficient. We are working on various optimizations and features for bitswap to
accomplish this. If successful, a direct transfer of a large file between two
peers will produce very minimal network chatter, and transferring files between
a set of peers will result in fewer (ideally close to zero) duplicated blocks
being sent.

Once we have accomplished that (or perhaps concurrently) we will push for
better connection management in libp2p. In particular, the ability to limit the
number of connections an ipfs node maintains, and the ability to set options
that govern bandwidth usage and network activity. More details on this goal are
documented here.

We are also working on improving our pubsub. Work is starting towards a
more efficient message routing algorithm, and we are also working on a way
to propagate ipns entries via pubsub as a way to greatly improve the latency
of ipns operations.

In addition to those goals, we have a somewhat separate push towards more ipld
integrations. This involves making it easier to plug in new formats (such as
git, ethereum and zcash) as well as improving the general performance of path
traversals and object operations via the ipfs dag API. Work towards adding a
plugin system (currently linux only) will remove
the need to recompile to add support for new object types, and blockchain
integration work will allow users to build block explorers and lite clients directly
on top of ipfs.

As always, if you want to help out, there are many ways to participate. Check out help wanted issues, or drop by IRC (#ipfs on freenode) and say hi! See our contributing doc for more info.
