Skip to content

Commit

Permalink
Merge pull request CalebQ42#28 from CalebQ42/exp1
Browse files Browse the repository at this point in the history
Separation
  • Loading branch information
CalebQ42 authored Dec 28, 2023
2 parents fcd8c4c + ef72408 commit e9de9e6
Show file tree
Hide file tree
Showing 53 changed files with 1,976 additions and 2,296 deletions.
19 changes: 14 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,27 +2,36 @@

[![PkgGoDev](https://pkg.go.dev/badge/github.com/CalebQ42/squashfs)](https://pkg.go.dev/github.com/CalebQ42/squashfs) [![Go Report Card](https://goreportcard.com/badge/github.com/CalebQ42/squashfs)](https://goreportcard.com/report/github.com/CalebQ42/squashfs)

A PURE Go library to read squashfs. There is currently no plans to add archive creation support as it will almost always be better to just call `mksquashfs`. I could see some possible use cases, but probably won't spend time on it unless it's requested (open a discussion fi you want this feature).
A PURE Go library to read squashfs. There is currently no plans to add archive creation support as it will almost always be better to just call `mksquashfs`. I could see some possible use cases, but probably won't spend time on it unless it's requested (open a discussion if you want this feature).

The library has two parts with this `github.com/CalebQ42/squashfs` being easy to use as it implements `io/fs` interfaces and doesn't expose unnecessary information. 95% this is the library you want. If you need lower level access to the information, use `github.com/CalebQ42/squashfs/low` where far more information is exposed.

Currently has support for reading squashfs files and extracting files and folders.

Special thanks to <https://dr-emann.github.io/squashfs/> for some VERY important information in an easy to understand format.
Thanks also to [distri's squashfs library](https://github.com/distr1/distri/tree/master/internal/squashfs) as I referenced it to figure some things out (and double check others).

## [TODO](https://github.com/CalebQ42/squashfs/projects/1?fullscreen=true)
## FUSE

As of `v1.0`, FUSE capabilities has been moved to [a separate library](https://github.com/CalebQ42/squashfuse).

## Limitations

* No Xattr parsing. This is simply because I haven't done any research on it and how to apply these in a pure go way.
* No Xattr parsing.
* Socket files are not extracted.
* From my research, it seems like a socket file would be useless if it could be created.
* Fifo files are ignored on `darwin`

## Issues

* Significantly slower then `unsquashfs` when extracting folders (about 5 ~ 7 times slower on a ~100MB archive using zstd compression)
* Significantly slower then `unsquashfs` when extracting folders
* This seems to be related to above along with the general optimization of `unsquashfs` and it's compression libraries.
* The larger the file's tree, the slower the extraction will be. Arch Linux's Live USB's airootfs.sfs takes ~35x longer for a full extraction.
* Times seem to be largely dependent on file tree size and compression type.
* My main testing image (~100MB) using Zstd takes about 6x longer.
* An Arch Linux airootfs image (~780MB) using XZ compression with LZMA filters takes about 32x longer.
* A Tensorflow docker image (~3.3GB) using Zstd takes about 12x longer.

Note: These numbers are using `FastOptions()`. `DefaultOptions()` takes about 2x longer.

## Recommendations on Usage

Expand Down
49 changes: 49 additions & 0 deletions extraction_options.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,49 @@
package squashfs

import (
"io"
"io/fs"
"runtime"

"github.com/CalebQ42/squashfs/internal/routinemanager"
)

type ExtractionOptions struct {
manager *routinemanager.Manager
LogOutput io.Writer //Where the verbose log should write.
DereferenceSymlink bool //Replace symlinks with the target file.
UnbreakSymlink bool //Try to make sure symlinks remain unbroken when extracted, without changing the symlink.
Verbose bool //Prints extra info to log on an error.
IgnorePerm bool //Ignore file's permissions and instead use Perm.
Perm fs.FileMode //Permission to use when IgnorePerm. Defaults to 0777.
SimultaneousFiles uint16 //Number of files to process in parallel. Default set based on runtime.NumCPU().
ExtractionRoutines uint16 //Number of goroutines to use for each file's extraction. Only applies to regular files. Default set based on runtime.NumCPU().
}

// The default extraction options.
func DefaultOptions() *ExtractionOptions {
cores := uint16(runtime.NumCPU() / 2)
var files, routines uint16
if cores <= 4 {
files = 1
routines = cores
} else {
files = cores - 4
routines = 4
}
return &ExtractionOptions{
Perm: 0777,
SimultaneousFiles: files,
ExtractionRoutines: routines,
}
}

// Less limited default options. Can run up 2x faster than DefaultOptions.
// Tends to use all available CPU resources.
func FastOptions() *ExtractionOptions {
return &ExtractionOptions{
Perm: 0777,
SimultaneousFiles: uint16(runtime.NumCPU()),
ExtractionRoutines: uint16(runtime.NumCPU()),
}
}
Loading

0 comments on commit e9de9e6

Please sign in to comment.