-
Notifications
You must be signed in to change notification settings - Fork 3
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
1 parent
13e19d1
commit a9b3680
Showing
7 changed files
with
218 additions
and
2 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
src/tests/data/* linguist-vendored=false |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,28 @@ | ||
--- | ||
name: Bug report | ||
about: Create a report to help us fix something bad like an exception | ||
title: "[BUG]" | ||
labels: bug, help wanted | ||
assignees: '' | ||
|
||
--- | ||
|
||
**Describe the bug** | ||
A clear and concise description of what the bug/exception is. | ||
|
||
**To Reproduce** | ||
Give us the target text file. Host it somewhere with untouched encoding. | ||
|
||
**Expected behavior** | ||
A clear and concise description of what you expected to happen. | ||
|
||
**Logs** | ||
If applicable, add console outputs to help explain your problem. | ||
|
||
**Desktop (please complete the following information):** | ||
- OS: [e.g. Linux, Windows or Mac] | ||
- Rust version [e.g. 1.7] | ||
- Package version [eg. 1.0.0] | ||
|
||
**Additional context** | ||
Add any other context about the problem here. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
--- | ||
name: Feature request | ||
about: Suggest an idea for this project | ||
title: "[Proposal]" | ||
labels: enhancement | ||
assignees: '' | ||
|
||
--- | ||
|
||
**Is your feature request related to a problem? Please describe.** | ||
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...] | ||
|
||
**Describe the solution you'd like** | ||
A clear and concise description of what you want to happen. | ||
|
||
**Describe alternatives you've considered** | ||
A clear and concise description of any alternative solutions or features you've considered. | ||
|
||
**Additional context** | ||
Add any other context or screenshots about the feature request here. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,39 @@ | ||
--- | ||
name: Wrong charset / Detection issue | ||
about: Create a report to help us improve the detection mechanism | ||
title: "[DETECTION]" | ||
labels: help wanted, detection | ||
assignees: '' | ||
|
||
--- | ||
|
||
**Notice** | ||
I hereby announce that my raw input is not : | ||
- Too small content (<=32 characters) as I do know that ANY charset detector heavily depends on content | ||
- Encoded in a deprecated/abandoned encoding that is not even supported by encoding Rust library | ||
|
||
**Provide the file** | ||
A accessible way of retrieving the file concerned. Host it somewhere with untouched encoding. | ||
|
||
**Verbose output** | ||
Using the CLI, run `normalizer -v ./my-file.txt` and past the result in here. | ||
|
||
``` | ||
(venv) >normalizer -v ./data/sample.1.ar.srt | ||
2021-05-21 08:38:44,050 | DEBUG | ascii does not fit given bytes sequence at ALL. 'ascii' codec can't decode byte 0xca in position 54: ordinal not in range(128) | ||
2021-05-21 08:38:44,051 | DEBUG | big5 does not fit given bytes sequence at ALL. 'big5' codec can't decode byte 0xc9 in position 60: illegal multibyte sequence | ||
2021-05-21 08:38:44,051 | DEBUG | big5hkscs does not fit given bytes sequence at ALL. 'big5hkscs' codec can't decode byte 0xc9 in position 60: illegal multibyte sequence | ||
.... | ||
``` | ||
|
||
**Expected encoding** | ||
A clear and concise description of what you expected as encoding. Any more details about how the current guess is wrong | ||
is very much appreciated. | ||
|
||
**Desktop (please complete the following information):** | ||
- OS: [e.g. Linux, Windows or Mac] | ||
- Python version [e.g. 1.7] | ||
- Package version [eg. 1.0.0] | ||
|
||
**Additional context** | ||
Add any other context about the problem here. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,131 @@ | ||
/target | ||
.idea/ | ||
.DS_Store | ||
|
||
# Byte-compiled / optimized / DLL files | ||
__pycache__/ | ||
*.py[cod] | ||
*$py.class | ||
|
||
# C extensions | ||
*.so | ||
|
||
# Distribution / packaging | ||
.Python | ||
build/ | ||
develop-eggs/ | ||
dist/ | ||
downloads/ | ||
eggs/ | ||
.eggs/ | ||
lib/ | ||
lib64/ | ||
parts/ | ||
sdist/ | ||
var/ | ||
wheels/ | ||
pip-wheel-metadata/ | ||
share/python-wheels/ | ||
*.egg-info/ | ||
.installed.cfg | ||
*.egg | ||
MANIFEST | ||
|
||
# PyInstaller | ||
# Usually these files are written by a python script from a template | ||
# before PyInstaller builds the exe, so as to inject date/other infos into it. | ||
*.manifest | ||
*.spec | ||
|
||
# Installer logs | ||
pip-log.txt | ||
pip-delete-this-directory.txt | ||
|
||
# Unit test / coverage reports | ||
htmlcov/ | ||
.tox/ | ||
.nox/ | ||
.coverage | ||
.coverage.* | ||
.cache | ||
nosetests.xml | ||
coverage.xml | ||
*.cover | ||
.hypothesis/ | ||
.pytest_cache/ | ||
|
||
# Translations | ||
*.mo | ||
*.pot | ||
|
||
# Django stuff: | ||
*.log | ||
local_settings.py | ||
db.sqlite3 | ||
db.sqlite3-journal | ||
|
||
# Flask stuff: | ||
instance/ | ||
.webassets-cache | ||
|
||
# Scrapy stuff: | ||
.scrapy | ||
|
||
# Sphinx documentation | ||
docs/_build/ | ||
|
||
# PyBuilder | ||
target/ | ||
|
||
# Jupyter Notebook | ||
.ipynb_checkpoints | ||
|
||
# IPython | ||
profile_default/ | ||
ipython_config.py | ||
|
||
# pyenv | ||
.python-version | ||
|
||
# pipenv | ||
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control. | ||
# However, in case of collaboration, if having platform-specific dependencies or dependencies | ||
# having no cross-platform support, pipenv may install dependencies that don't work, or not | ||
# install all needed dependencies. | ||
#Pipfile.lock | ||
|
||
# celery beat schedule file | ||
celerybeat-schedule | ||
|
||
# SageMath parsed files | ||
*.sage.py | ||
|
||
# Environments | ||
.env | ||
.venv | ||
env/ | ||
venv/ | ||
ENV/ | ||
env.bak/ | ||
venv.bak/ | ||
|
||
# Spyder project settings | ||
.spyderproject | ||
.spyproject | ||
|
||
# Rope project settings | ||
.ropeproject | ||
|
||
# mkdocs documentation | ||
/site | ||
|
||
# mypy | ||
.mypy_cache/ | ||
.dmypy.json | ||
dmypy.json | ||
|
||
# Pyre type checker | ||
.pyre/ | ||
|
||
.idea/ | ||
char-dataset/ |
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,6 @@ | ||
[package] | ||
name = "charset-normalizer-rs" | ||
version = "0.1.0" | ||
version = "1.0.0" | ||
edition = "2021" | ||
|
||
[dependencies] | ||
|