Merge pull request emscripten-core#1 from rstz/pthreadfs-initial
Add PThreadFS folder
rstz authored Aug 6, 2021
2 parents 071b8b1 + 6009917 commit 90f5f03
Showing 35 changed files with 10,334 additions and 0 deletions.
6 changes: 6 additions & 0 deletions .gitignore
Expand Up @@ -23,3 +23,9 @@ htmlcov/
coverage.xml

.DS_Store

# PThreadfs
pthreadfs/examples/cache
pthreadfs/examples/dist
pthreadfs/examples/out
pthreadfs/examples/sqlite-src
157 changes: 157 additions & 0 deletions pthreadfs/README.md
@@ -0,0 +1,157 @@
# PThreadFS

The Emscripten Pthread File System (PThreadFS) makes (partly) asynchronous storage APIs such as [OPFS Access Handles](https://docs.google.com/document/d/121OZpRk7bKSF7qU3kQLqAEUVSNxqREnE98malHYwWec/edit#heading=h.gj2fudnvy982) usable through the Emscripten File System API. This enables C++ applications compiled through Emscripten to use persistent storage without using [Asyncify](https://emscripten.org/docs/porting/asyncify.html). PThreadFS requires only minimal modifications to the C++ code and nearly achieves feature parity with Emscripten's classic File System API.

PThreadFS works by replacing Emscripten's file system API with a new API that proxies all file system operations to a dedicated pthread. This dedicated thread maintains a virtual file system that can use different APIs as backends (very similar to the way Emscripten's VFS is designed). In particular, PThreadFS comes with built-in support for asynchronous backends such as [OPFS Access Handles](https://docs.google.com/document/d/121OZpRk7bKSF7qU3kQLqAEUVSNxqREnE98malHYwWec/edit#heading=h.gj2fudnvy982).
Although the underlying storage API is asynchronous, PThreadFS makes it appear synchronous to the C++ application.
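
The proxying idea described above can be sketched in plain C++. This is an illustration only and not PThreadFS's actual implementation: `FsProxy`, `WriteFile` and `ReadFile` are hypothetical names, and an in-memory map stands in for an asynchronous storage backend. The sketch only shows how a caller can block on a result while a dedicated thread owns all file system state.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <map>
#include <memory>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>

// Hypothetical sketch: every operation is queued to one dedicated thread,
// and the calling thread blocks until the result is available.
class FsProxy {
 public:
  FsProxy() : worker_([this] { Run(); }) {}

  ~FsProxy() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      done_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }

  // Appears synchronous to the caller; the map is only touched by the worker.
  void WriteFile(const std::string& path, const std::string& data) {
    Call<bool>([this, path, data] {
      files_[path] = data;
      return true;
    });
  }

  std::string ReadFile(const std::string& path) {
    return Call<std::string>([this, path] {
      auto it = files_.find(path);
      return it == files_.end() ? std::string() : it->second;
    });
  }

 private:
  // Queues `fn` for the worker thread and waits for its return value.
  template <typename R, typename F>
  R Call(F fn) {
    auto task = std::make_shared<std::packaged_task<R()>>(std::move(fn));
    std::future<R> result = task->get_future();
    {
      std::lock_guard<std::mutex> lock(mu_);
      jobs_.push([task] { (*task)(); });
    }
    cv_.notify_one();
    return result.get();  // Blocks the caller until the worker has finished.
  }

  // Worker loop: runs queued jobs until shutdown is requested and the queue is empty.
  void Run() {
    for (;;) {
      std::function<void()> job;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return done_ || !jobs_.empty(); });
        if (jobs_.empty()) return;
        job = std::move(jobs_.front());
        jobs_.pop();
      }
      job();
    }
  }

  std::mutex mu_;
  std::condition_variable cv_;
  std::queue<std::function<void()>> jobs_;
  bool done_ = false;
  std::map<std::string, std::string> files_;  // Stand-in for a real backend.
  std::thread worker_;  // Declared last so all other members exist before it starts.
};
```

With this sketch, `fs.WriteFile("/filesystemaccess/a.txt", "hello")` followed by `fs.ReadFile("/filesystemaccess/a.txt")` behaves like ordinary blocking calls, even though the work happens on another thread. In the browser, the blocking wait is presumably also why the application code must not run on the main thread.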

The code is still prototype quality and **should not be used in a production environment** for the time being.

## Enable and detect OPFS in Chrome

OPFS Access Handles require very recent versions of Google Chrome Canary. PThreadFS has been successfully tested with Version 94.0.4597.0.

To enable the API, the `--enable-features=FileSystemAccessAccessHandle` flag must be set when starting Chrome from the command line. On macOS, this can be done with
```
open -a /Applications/Google\ Chrome\ Canary.app --args --enable-features=FileSystemAccessAccessHandle
```

Support for Access Handles in OPFS can be detected with
```
async function detectAccessHandleWorker() {
  const root = await navigator.storage.getDirectory();
  const file = await root.getFileHandle('access-handle-detect', { create: true });
  const present = file.createSyncAccessHandle != undefined;
  await root.removeEntry('access-handle-detect');
  return present;
}
async function detectAccessHandleMainThread() {
  const detectAccessHandleAndPostMessage = async function(){
    const root = await navigator.storage.getDirectory();
    const file = await root.getFileHandle('access-handle-detect', { create: true });
    const present = file.createSyncAccessHandle != undefined;
    await root.removeEntry('access-handle-detect');
    postMessage(present);
  };
  return new Promise((resolve, reject) => {
    const detectBlob = new Blob(['('+detectAccessHandleAndPostMessage.toString()+')()'], {type: 'text/javascript'})
    const detectWorker = new Worker(window.URL.createObjectURL(detectBlob));
    detectWorker.onmessage = result => {
      resolve(result.data);
      detectWorker.terminate();
    };
    detectWorker.onerror = error => {
      reject(error);
      detectWorker.terminate();
    };
  });
}
```

## Getting the code

PThreadFS is available on GitHub in the [emscripten-pthreadfs](https://github.com/rstz/emscripten-pthreadfs) repository. All code resides in the `pthreadfs` folder. It should be usable with any up-to-date Emscripten version.

There is **no need** to use a fork of Emscripten itself since all code operates in user-space.

## Using PThreadFS in a project

To use the code in a new project, you only need the three files in the `pthreadfs` folder: `library_pthreadfs.js`, `pthreadfs.cpp` and `pthreadfs.h`. The files are included as follows:

### Code changes

- Include `pthreadfs.h` in the C++ file containing `main()`:
```
#include "pthreadfs.h"
```
- Call `emscripten_init_pthreadfs();` at the top of `main()` (or before any file system syscalls).
- PThreadFS maintains a virtual file system. The OPFS backend is mounted at `/filesystemaccess/`. Only files in this folder are persisted between sessions. All other files will be stored in-memory through MEMFS.
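
As a rough sketch of what the points above mean for application code, the helper below keeps a run counter in a file using ordinary C stdio calls. The names `init_fs`, `bump_run_counter` and `run_count.txt` are hypothetical and not part of PThreadFS; the only PThreadFS symbols used are `pthreadfs.h` and `emscripten_init_pthreadfs()`, and those are guarded so that the snippet also compiles natively.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

#ifdef __EMSCRIPTEN__
#include "pthreadfs.h"
// Call this at the top of main(), before any file system syscalls.
inline void init_fs() { emscripten_init_pthreadfs(); }
#else
// Native build (assumption for illustration): nothing to initialize.
inline void init_fs() {}
#endif

// Hypothetical helper: reads a counter from <dir>/run_count.txt, increments
// it, writes it back, and returns the new value. A missing or unreadable
// file counts as zero.
int bump_run_counter(const std::string& dir) {
  assert(!dir.empty());
  const std::string path = dir + "/run_count.txt";

  int count = 0;
  if (std::FILE* in = std::fopen(path.c_str(), "r")) {
    if (std::fscanf(in, "%d", &count) != 1) count = 0;
    std::fclose(in);
  }

  ++count;
  if (std::FILE* out = std::fopen(path.c_str(), "w")) {
    std::fprintf(out, "%d\n", count);
    std::fclose(out);
  }
  return count;
}
```

In a PThreadFS build, one would call `init_fs()` first and then `bump_run_counter("/filesystemaccess")`. Since only that folder is persisted, the counter should keep growing across page reloads there, while a directory backed by MEMFS would be expected to start again from 1 in each session.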

### Build process changes

There are two changes required to build a project with PThreadFS:
- Compile `pthreadfs.h` and `pthreadfs.cpp` and link the resulting object to your application. Add `-pthread` to the compiler flags to include support for pthreads.
- Add the following options to the linking step:
```
-pthread -O3 -s PROXY_TO_PTHREAD --js-library=library_pthreadfs.js
```
**Example**
If your build process was
```shell
emcc myproject.cpp -o myproject.html
```
Your new build step should be
```shell
emcc -pthread -s PROXY_TO_PTHREAD -O3 --js-library=library_pthreadfs.js myproject.cpp pthreadfs.cpp -o myproject.html
```

### Advanced Usage

If you want to modify the PThreadFS file system directly, you may use the macro `EM_PTHREADFS_ASM()` defined in `pthreadfs.h`. The macro allows you to run asynchronous JavaScript on the pthread hosting the PThreadFS file system. For example, you may create a folder in the virtual file system by calling
```
EM_PTHREADFS_ASM(
await PThreadFS.mkdir('mydirectory');
);
```
See `pthreadfs/examples/emscripten-tests/` for usage examples.


## Known Limitations

- The code is still prototype quality and **should not be used in a production environment** yet. It is possible that the use of PThreadFS might lead to subtle bugs in other libraries.
- PThreadFS requires PROXY_TO_PTHREAD to be active. In particular, no system calls interacting with the file system should be called from the main thread.
- Some functionality of the Emscripten File System API is missing, such as sockets, IndexedDB integration and support for XHRequests.
- PThreadFS depends on C++ libraries. `EM_PTHREADFS_ASM()` cannot be used within C files (although initializing through `emscripten_init_pthreadfs()` is possible, see the `pthreadfs/examples/sqlite-speedtest` for an example).
- Only in-memory storage (MEMFS) and OPFS Access Handles (FSAFS) are available as backends for PThreadFS. There is no support (yet) for persisting data into IndexedDB (the way IDBFS works). Limited support is available for the Storage Foundation API as a backend.
- Performance is good if and only if full optimizations (compiler option `-O3`) are enabled and DevTools are closed.
- Using stdout from C++ only prints to the JavaScript console, not to the Emscripten-generated HTML file.


## Examples

The examples are provided to show how projects can be transformed to use PThreadFS. To build them, navigate to the `pthreadfs/examples/` folder and run `make all`. You need to have the [Emscripten SDK](https://emscripten.org/docs/getting_started/downloads.html) activated for the build process to succeed.

### SQLite Speedtest

This example shows how to compile and run the [speedtest1](https://www.sqlite.org/cpu.html) from the SQLite project in the browser.

The Makefile downloads the source of the speedtest and sqlite3 directly from <https://sqlite.org>.

To compile, navigate to the `pthreadfs/examples/` directory and run

```shell
make sqlite-speedtest
cd dist/sqlite-speedtest
python3 -m http.server 8888
```
Then open the following link in a Chrome instance with
_OPFS Access Handles_ [enabled](#enable-and-detect-opfs-in-chrome):

[localhost:8888/sqlite-speedtest](http://localhost:8888/sqlite-speedtest). The results of the speedtest can be found in the DevTools console.

### Other tests

The folder `pthreadfs/examples/emscripten-tests` contains a number of other file system tests taken from Emscripten's standard test suite.

To compile, navigate to the `pthreadfs/examples/` directory and run

```shell
make emscripten-tests
cd dist/emscripten-tests
python3 -m http.server 8888
```
Then open the following link in a Chrome instance with
_OPFS Access Handles_ [enabled](#enable-and-detect-opfs-in-chrome):

[localhost:8888/emscripten-tests](http://localhost:8888/emscripten-tests) and choose a test. The results of the test can be found in the DevTools console.

## Authors
- Richard Stotz (<[email protected]>)

This is not an official Google product.
127 changes: 127 additions & 0 deletions pthreadfs/examples/Makefile
@@ -0,0 +1,127 @@
# Emscripten
EMCC = emcc

# Define some folders
EMTESTS = emscripten-tests
SQLITETESTS = sqlite-speedtest
OBJ = out/bc

PTHREADFS_JS = ../library_pthreadfs.js
PTHREADFS_H = ../pthreadfs.h
PTHREADFS_CPP = ../pthreadfs.cpp

# I got this handy makefile syntax from : https://github.com/mandel59/sqlite-wasm (MIT License) Credited in LICENSE
# To use another version of Sqlite, visit https://www.sqlite.org/download.html and copy the appropriate values here:
SQLITE_AMALGAMATION = sqlite-amalgamation-3350000
SQLITE_AMALGAMATION_ZIP_URL = https://www.sqlite.org/2021/sqlite-amalgamation-3350000.zip
SQLITE_AMALGAMATION_ZIP_SHA1 = ba64bad885c9f51df765a9624700747e7bf21b79
SQLITE_H = sqlite3.h
SQLITE_SPEEDTEST = speedtest1.c
SQLITE_SPEEDTEST_URL = https://sqlite.org/src/raw/5e5b805f24cc939656058f6a498f5a2160f9142e4815c54faf758ec798d4cdad?at=speedtest1.c
SQLITE_SPEEDTEST_SHA1 = f0edde2ad68f090e4676ac30042e0f6b765a8528

## cache
cache/$(SQLITE_AMALGAMATION).zip:
mkdir -p cache
curl -LsSf '$(SQLITE_AMALGAMATION_ZIP_URL)' -o $@

cache/$(SQLITE_SPEEDTEST):
mkdir -p cache
curl -LsSf '$(SQLITE_SPEEDTEST_URL)' -o $@


## sqlite-src
.PHONY: sqlite-src
sqlite-src: sqlite-src/$(SQLITE_AMALGAMATION) sqlite-src/$(SQLITE_SPEEDTEST)

sqlite-src/$(SQLITE_AMALGAMATION): cache/$(SQLITE_AMALGAMATION).zip sqlite-src/$(SQLITE_AMALGAMATION)/$(EXTENSION_FUNCTIONS)
mkdir -p sqlite-src/$(SQLITE_AMALGAMATION)
echo '$(SQLITE_AMALGAMATION_ZIP_SHA1) ./cache/$(SQLITE_AMALGAMATION).zip' > cache/check.txt
shasum -c cache/check.txt
# We don't delete the sqlite_amalgamation folder. That's a job for clean
# Also, the extension functions get copied here, and if we get the order of these steps wrong,
# this step could remove the extension functions, and that's not what we want
unzip -u 'cache/$(SQLITE_AMALGAMATION).zip' -d sqlite-src/
touch $@

sqlite-src/$(SQLITE_AMALGAMATION)/$(SQLITE_H): sqlite-src/$(SQLITE_AMALGAMATION)

sqlite-src/$(SQLITE_SPEEDTEST): cache/$(SQLITE_SPEEDTEST)
mkdir -p sqlite-src/$(SQLITE_AMALGAMATION)
echo '$(SQLITE_SPEEDTEST_SHA1) ./cache/$(SQLITE_SPEEDTEST)' > cache/check.txt
shasum -c cache/check.txt
cp 'cache/$(SQLITE_SPEEDTEST)' $@
echo 'extern void emscripten_init_pthreadfs();' > includeline.tmp
mv $@ input.tmp
cat includeline.tmp input.tmp > $@
rm input.tmp includeline.tmp
sed -i '' -e '/int main(/a\'$$'\n''emscripten_init_pthreadfs();' $@


# build options
OPTIMIZATION_LEVEL = O3

CFLAGS = \
-$(OPTIMIZATION_LEVEL) \
-Wall \
-pthread \
-I..

CFLAGS_SQLITE = \
$(CFLAGS) \
-DSQLITE_ENABLE_MEMSYS5 \
-D_HAVE_SQLITE_CONFIG_H \
-DSQLITE_OMIT_LOAD_EXTENSION \
-DSQLITE_DISABLE_LFS \
-DSQLITE_THREADSAFE=0 \
-DSQLITE_ENABLE_NORMALIZE \
-Isqlite-src/$(SQLITE_AMALGAMATION)/

LINK_FLAGS = \
-pthread \
-s PROXY_TO_PTHREAD \
-s INITIAL_MEMORY=134217728

.PHONY: all
all: sqlite-speedtest emscripten-tests

.PHONY: clean
clean:
rm -rf dist/*
rm -rf out/
rm -f cache/*
rm -rf sqlite-src/

$(OBJ)/sqlite3.o: sqlite-src/$(SQLITE_AMALGAMATION)
mkdir -p $(OBJ)
$(EMCC) $(CFLAGS) -c sqlite-src/$(SQLITE_AMALGAMATION)/sqlite3.c -o $@

$(OBJ)/speedtest1.o: sqlite-src/$(SQLITE_SPEEDTEST) sqlite-src/$(SQLITE_AMALGAMATION)
mkdir -p $(OBJ)
$(EMCC) $(CFLAGS) -Isqlite-src/$(SQLITE_AMALGAMATION)/ -c $< -o $@

$(OBJ)/pthreadfs.o : $(PTHREADFS_CPP)
mkdir -p $(OBJ)
$(EMCC) -c $(CFLAGS) $< -o $@

$(OBJ)/%.o : $(EMTESTS)/%.cpp
mkdir -p $(OBJ)
$(EMCC) -c $(CFLAGS) $< -o $@

# Don't delete my precious object files
.PRECIOUS: $(OBJ)/%.o

.PHONY: sqlite-speedtest
sqlite-speedtest: dist/sqlite-speedtest/index.html
dist/sqlite-speedtest/index.html: $(OBJ)/speedtest1.o $(OBJ)/sqlite3.o $(OBJ)/pthreadfs.o $(PTHREADFS_JS) $(SQLITETESTS)/sqlite-speedtest-prejs.js
mkdir -p dist/sqlite-speedtest
$(EMCC) $(LINK_FLAGS) --pre-js=$(SQLITETESTS)/sqlite-speedtest-prejs.js --js-library=$(PTHREADFS_JS) $< $(word 2,$^) $(word 3,$^) -o $@

.PHONY: emscripten-tests
emscripten-tests: $(addprefix dist/, $(addsuffix .html, $(basename $(wildcard $(EMTESTS)/*.cpp))))
@echo 'building emscripten-tests' $?

dist/$(EMTESTS)/%.html : $(OBJ)/%.o $(OBJ)/pthreadfs.o $(PTHREADFS_JS)
mkdir -p dist/$(EMTESTS)
$(EMCC) $(LINK_FLAGS) --js-library=$(PTHREADFS_JS) $< $(word 2,$^) -o $@

58 changes: 58 additions & 0 deletions pthreadfs/examples/emscripten-tests/access.cpp
@@ -0,0 +1,58 @@
/*
* Copyright 2011 The Emscripten Authors. All rights reserved.
* Emscripten is available under two separate licenses, the MIT license and the
* University of Illinois/NCSA Open Source License. Both these licenses can be
* found in the LICENSE file.
*/

#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <emscripten.h>
#include <fcntl.h>
#include <sys/stat.h>
#include "pthreadfs.h"

int main() {
emscripten_init_pthreadfs();
EM_PTHREADFS_ASM(
await PThreadFS.mkdir('working');
await PThreadFS.chdir('working');
await PThreadFS.writeFile('forbidden', ""); await PThreadFS.chmod('forbidden', 0o000);
await PThreadFS.writeFile('readable', ""); await PThreadFS.chmod('readable', 0o444);
await PThreadFS.writeFile('writeable', ""); await PThreadFS.chmod('writeable', 0o222);
await PThreadFS.writeFile('allaccess', ""); await PThreadFS.chmod('allaccess', 0o777);
);
// Empty path checks #9136 fix
// String literals are const in C++, so the array holds const char pointers.
const char* files[] = {"readable", "writeable",
"allaccess", "forbidden", "nonexistent", ""};
for (size_t i = 0; i < sizeof files / sizeof files[0]; i++) {
printf("F_OK(%s): %d\n", files[i], access(files[i], F_OK));
printf("errno: %d\n", errno);
errno = 0;
printf("R_OK(%s): %d\n", files[i], access(files[i], R_OK));
printf("errno: %d\n", errno);
errno = 0;
printf("X_OK(%s): %d\n", files[i], access(files[i], X_OK));
printf("errno: %d\n", errno);
errno = 0;
printf("W_OK(%s): %d\n", files[i], access(files[i], W_OK));
printf("errno: %d\n", errno);
errno = 0;
printf("\n");
}

EM_PTHREADFS_ASM(
await PThreadFS.writeFile('filetorename', 'renametest');
);

rename("filetorename", "renamedfile");

errno = 0;
printf("F_OK(%s): %d\n", "filetorename", access("filetorename", F_OK));
printf("errno: %d\n", errno);
errno = 0;
printf("F_OK(%s): %d\n", "renamedfile", access("renamedfile", F_OK));
printf("errno: %d\n", errno);
return 0;
}